1.0 INTRODUCTION

This report will outline the data processing techniques to be 
studied for use in infrared astronomy data analysis systems. 

The ensuing investigation will be restricted to consideration 
of data from space-based telescope systems operating as survey 
instruments. Resulting algorithms, and in some cases specific 
software, will be applicable for use with the Infrared Astronomy 
Satellite (IRAS) and the Shuttle Infrared Telescope Facility 
(SIRTF). Operational tests will be made during the investigation 
using data from the Celestial Mapping Program (CMP). The 
overall task is somewhat different from that involved in 
ground-based infrared telescope data reduction. 

Section 2.0 reviews the characteristics of space-based survey 
data and the differences between that and ground-based data. 
Sections 3.0 and 4.0 then discuss the processing task needed 
for point sources and extended sources, respectively. Section 
5.0 considers the overall software/hardware data processing
system involved, and Section 6.0 concludes this report with a
reference list including a number of representative texts 
related to the data processing task. 

2.0 DATA FROM INFRARED OBSERVATIONS

This section reviews the techniques of infrared astronomical 
measurement and the resulting data streams. Included are 
descriptions of representative space survey systems and the 
resulting data collected by one of them. A three-level 
division of infrared source data is described based on the
divergence in data processing approaches created by physical 
differences in the astronomical sources. 

The application of the data reduction techniques discussed 
in this report is limited for the most part to the processing 
of survey measurements. A primary requirement of survey 
analysis is the discovery of unknown but physically real
infrared sources and the determination of their positions and
intensities. Other photometric studies, on the other hand, are
intended to measure to high accuracy the intensities and
spectral characteristics of known sources. Survey data is
intrinsically statistical in nature in that a tradeoff occurs 
between the accuracy of a measurement (existence, position, 
intensity), the observation schedule, and the data processing 
techniques, which gives a non-zero false detection rate for 
maximum information transfer. Optimizing this information on 
the basis of some defined set of criteria is the goal of the 
data processing system and has direct implications on the 
design of the sensor. 

2.1 Infrared Astronomical Measurement Techniques

Infrared astronomical measurements are essentially photometric 
in nature rather than image-oriented. That is, a measurement 
of the infrared radiation from a specific direction is made by a 
collection of mechanical, optical, and electrical components, 
which results in an electrical signal related to the incident 
infrared intensity. The temporal sequence implicit in this 
electrical signal is produced by some induced variation in the 
infrared illumination on the detectors. Most ground-based 
infrared astronomical systems utilize controlled optical beam 
switching which alternately illuminates the detector(s) with 
the radiation from two different regions. Most space-based
systems utilize directional scanning to illuminate the
detectors with radiation from a sequence of positions. A
number of variations on these two approaches are used, and
applications are not exclusively ground- or space-based for
one or the other, but two different types of data streams
result from the described approaches. This report will directly
address the processing task for the second data gathering
technique. For comparison, however, a general approach to
beam switching data gathering follows. 

2.1.1 Ground-Based Infrared Observations 

Ground-based telescopes realize beam switching by oscillating 
one of the telescope's optical components, usually the secondary 
mirror of a Cassegrain telescope. The modulation frequency is 
chosen either for optimum detector response or to minimize the
effects of spatial and temporal variations in atmospheric
emission. Small fields of view and small beam separations are 
used to minimize the effects of this sky noise. Such an 
approach allows high accuracy in photometric measurement but 
strongly discriminates against extended sources or low brightness
gradients, and also is at odds with survey requirements of rapid 
area coverage. 

Further, these oscillating secondary mirrors and detector dewars 
are commonly installed on telescopes initially designed for 
visual photography. The secondary oscillation commonly 
induces a signal due to side-lobe emissions of the telescope 
structure which limits the system performance level. This is 
partially treated by using undersized secondary mirrors, thus 
wasting some fraction of the collected photons. An oscillating
primary mirror was used in the 2.2 micron survey of Neugebauer
and Leighton to avoid this difficulty.

The modulated radiation is transferred to a cryogenic detector, 
passing through one or more spectral filters. It is common to 
use two filters, one acting as the window to the cryogenic 
dewar and a second one internal to the dewar, cooled to the 
detector temperature to reduce the thermal emission from it to 
the detector. Even with this approach the radiant flux within 
the spectral bandpass is dominantly sky photons and the 
detector materials are restricted to high background flux 
types. This is because the detector always sees either bright
sky or bright sky plus dim stars in a beam switching system.

The square wave signal from the detector is amplified by a 
low-noise A.C. coupled amplifier mounted within or immediately
outside the dewar assembly. For either case, a load resistor
is usually mounted on the cold sink within the dewar to minimize
its thermal noise. The signal is then rectified by a
phase-locked amplifier synchronized to the secondary mirror
oscillations and integrated until the signal-to-noise ratio
has reached an acceptable level. 
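
The rectification and integration step can be illustrated with a
minimal Python sketch. It assumes an idealized, noise-free square
wave and a perfectly synchronized +1/-1 reference; the function names
and the numerical values are hypothetical and are not taken from any
actual instrument.

```python
def simulate(amplitude, sky, n_cycles, samples_per_half=8):
    """Idealized detector output for a beam-switched system (assumed model).

    One half cycle sees sky plus source, the other half sees sky only.
    Returns the samples and the matching +1/-1 chopper reference.
    """
    samples, reference = [], []
    for _ in range(n_cycles):
        samples += [sky + amplitude] * samples_per_half  # beam on the source
        reference += [+1] * samples_per_half
        samples += [sky] * samples_per_half              # beam on empty sky
        reference += [-1] * samples_per_half
    return samples, reference


def lock_in_average(samples, reference):
    """Multiply by the reference (synchronous rectification), then average."""
    return sum(s * r for s, r in zip(samples, reference)) / len(samples)


samples, reference = simulate(amplitude=0.02, sky=5.0, n_cycles=50)
# Factor of 2: the source is in the beam for only half of the samples.
estimate = 2.0 * lock_in_average(samples, reference)
print(f"recovered source amplitude: {estimate:.4f}")
```

The large, constant sky term cancels between the two half cycles, so
only the small source term survives the averaging. With real noise
added, longer integration narrows the scatter of the estimate.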

The measured voltage is then calibrated by observing standard 
stars shortly before or after the experimental measurement. 

These standards are chosen to be nearby the measurement to 
minimize the effects of air mass and directional variations in 
atmospheric transmission. Positions are determined from the 
outputs of the setting circles of the telescope and from offsets 
of known stars. 

A number of aspects of this approach limit its usefulness for
sky surveying. To achieve some uniformity in survey operation
the telescope is generally scanned slowly, with the dwell time of
a star on a detector determining the integration period and
defining the sensitivity limit. In this manner, the
Neugebauer-Leighton survey stretched over a period of three
years, measuring almost 5600 sources in a declination band
between -33° and +81° brighter than 2.5 x 10^-15 watts cm^-2
µm^-1 at 2.2 µm. An attempt to survey at 11 µm was made by Low
(refs. 2, 3), observing in a narrow spectral window of the
atmosphere. His best results were at a sensitivity of
2 x 10^-16 watts cm^-2 µm^-1 at a rate of 3.8 square degrees per
hour. This implies a period of two years for
single-measurement coverage of a major fraction of the sky.
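
The coverage arithmetic can be checked with a short Python sketch.
It assumes, purely for illustration, that the "major fraction of the
sky" is the same declination band quoted for the 2.2 µm survey and
that the two years are counted as 730 nights; neither assumption is
stated in the text.

```python
import math

# Area of the whole celestial sphere, about 41253 square degrees.
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def band_fraction(dec_min_deg, dec_max_deg):
    """Fraction of the sphere lying between two declinations."""
    lo = math.sin(math.radians(dec_min_deg))
    hi = math.sin(math.radians(dec_max_deg))
    return (hi - lo) / 2


def observing_hours(area_sq_deg, rate_sq_deg_per_hour):
    """Hours needed to cover an area once at a fixed survey rate."""
    return area_sq_deg / rate_sq_deg_per_hour


area = band_fraction(-33, 81) * FULL_SKY_SQ_DEG   # assumed survey band
hours = observing_hours(area, 3.8)                # rate quoted in the text
print(f"area {area:.0f} sq deg, {hours:.0f} hours, "
      f"{hours / 730:.1f} hours per night over 730 nights")
```

Under these assumptions the quoted rate requires several thousand
hours of observing, which shows why a two-year program is implied
for a single pass.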

Even accepting a very slow rate of coverage and unimpressive 
sensitivity, ground-based surveys are limited by their inability 
to discover even slightly extended objects. A number of sources 
extending 4 to 5 arc minutes located by a sounding rocket 
survey are unmeasurable by current ground-based telescopes even 
when photographic identification of some of the sources has been made. 

Finally, data analysis in these systems is currently a manual 
task, and extensive system expansions would be needed to make 
even the collection of position and brightness information 
automatic. Furthermore, to channel this data into computer 
systems capable of handling the complexity and size of the data 
analysis task would make such an effort unacceptably costly in 
both dollars and facilities for such limited scientific output. 

No further discussion of ground-based systems or data processing 
will be made except for occasional fortuitous transfers from 
space survey systems and the techniques used on their data. 

2.1.2 Space-Based Infrared Measurements 

When a survey instrument is raised above the atmosphere, 
tremendous gains are realized in capability and simplicity.
The background photon flux and sky noise are eliminated, and
measurements can be made in spectral regions inaccessible to
ground-based telescopes. Furthermore, since frosting is not a
problem in space, the entire telescope system can be cooled to
greatly reduce the background from the instrument itself. Under
such low background conditions, infrared detectors exhibit very
high detectivities. The very short time constants of these
detectors permit high scan rates for temporal frequency
selection, eliminating the complexities of oscillating
components. The AFGL Infrared Celestial Survey Program
(refs. 4, 5, 6) is representative of previous space survey
efforts and is described below.

The AFGL survey was performed using a small cryogenically cooled
sounding rocket-borne telescope. The instrument was a 16.5 cm
diameter folded Gregorian equipped with internal baffles and
aperture stops to minimize side-lobe response and radiation
from the telescope structure, with all optical components
cooled by liquid helium to around 15°K. Interference filters
selectively isolated different portions of the linear
staggered detector array along the scan direction. This
permitted almost simultaneous measurements in three spectral
bands with effective wavelengths of 4.2, 11.0, and 19.8 µm
with bandwidths of 1.5, 5.1, and 5.6 µm, respectively.

The field-of-view for each detector was 3.4 arc minutes in the 
scan direction and 10.5 arc minutes in the cross scan direction. 
To ensure complete scan coverage each detector was overlapped
by adjacent elements in each color. This reduced the effective 
spatial resolution to 3.4 by 7.1 arc minutes for the non-overlapped 
portion and 3.4 by 1.7 arc minutes for the overlapped portion. 

The telescope was yoke-mounted in a rocket-fixed alt-azimuth
coordinate system. During the flight the telescope azimuth axis
was actively fixed in celestial coordinates to within 12 arc
seconds by means of a visual star tracker coupled to a cold
gas attitude control reaction system. The zenith position of
the telescope line of sight was read to ±30 arc seconds by a
digital optical encoder mounted on that axis. Azimuthal 
positions were obtained from the output of a visual stellar 
aspect sensor and scan rate gyro to 1 arc minute. 

Infrared sources transiting a detector generated electrical 
signals which were then amplified, bandlimited, sampled, 
digitized, and transmitted to the ground on a PCM telemetry 
link. Simultaneously, the outputs of the position control 
sensors were sampled, digitized, and merged with the detector 
data in the telemetry link. Time tags were added to the data 
from a crystal controlled reference clock at the ground station.

2.2 Sample Data Streams

To illustrate the complexity of the data processing task, 
several examples of raw data from a space survey are presented. 
Interpretation of these data records is aided by an understanding 
of the focal plane layout. There are eight detectors in each
of three groups, each group measuring a different spectral
region. For a given cross scan position one detector in each
color is used, separated by a small distance in the scanning
direction; the eight detectors are slightly overlapped in the 
cross scan direction. A point source will thus pass through 
one or two detectors in each color in a single scan producing 
pulses with a well-defined time lag in successive detectors. 
Sample data are shown in Figures 1 through 3. Each line is the 
signal from one detector displayed as a function of scan 
azimuth (time). For clarity of presentation and interpretation, 
the records are grouped into triplets of detectors, one in each 
of three colors; the eight groups are the cross-scan divisions of 
the detector array. Time and amplitude scales are the same in 
all figures. One channel of the bottom group is omitted from 
all figures because that detector was malfunctioning and not 
considered an element of the survey. 

Figure 1 is archetypal of the star survey data task. Prominent 
in the second group from the bottom is the three-color 
signature from a bright star showing the characteristic time 
stagger of a real source transit. It is important to note, 
however, that this is a very strong signal. While the actual 
signature of a star transit is determined by the focal plane 
design and the electronics system, the illustrated signal is 
a typical response of a system optimized for point source 
detection. The third color measurement here (bottom trace of
this group) is, to the eye, near the limit of detection, although
its actual peak signal-to-rms noise value is over six. Since
statistically significant numbers of real sources can be
detected with individual measurements having S/N values as
low as three, it is obvious that manual analysis will miss a
significant portion of the most interesting sources.

Figure 1. Sample Data - Three Color Star Measurement

Furthermore, the illustrated data, which is a plot of the 
digital sample sequence, is oversampled by a factor of four 
from the minimum necessary to identify a signal at a 90% 
confidence level. If constraints in another system require a 
minimum sampling rate, it is fully possible for two consecutive 
samples to bracket the true signal peak, thus underestimating
the peak value by 1/2 the ratio of sample rate and rise time 
times the digitizing step size. This significantly constrains 
the photometric accuracy for manual analysis approaches. 

On the other hand, numerical detection techniques can be 
constructed which operate very well at low S/N levels with a
false alarm rate which is a smooth function of the noise 
characteristics. Additionally, numerical methods can easily 
make best estimates of amplitudes by convolution with model 
signatures, allowing smaller photometric uncertainties. 
Secondary analysis and reconfirming observations can then be 
used to reduce the false alarm rate without losing the real 
but weak sources that eyeball analysis would always miss. 
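
The idea of estimating amplitudes by convolution with a model
signature can be sketched in Python as follows. This is a minimal
illustration under assumed conditions: the Gaussian pulse shape, the
white Gaussian noise, and all names and numbers are hypothetical and
do not describe the actual sensor response.

```python
import math
import random


def model_signature(width, length):
    """Hypothetical point-source pulse: Gaussian profile with unit peak."""
    centre = (length - 1) / 2
    return [math.exp(-0.5 * ((i - centre) / width) ** 2) for i in range(length)]


def matched_amplitudes(data, model):
    """Least-squares amplitude of the model at every lag (sliding correlation)."""
    norm = sum(m * m for m in model)
    out = []
    for lag in range(len(data) - len(model) + 1):
        corr = sum(data[lag + k] * m for k, m in enumerate(model))
        out.append(corr / norm)
    return out


random.seed(1)
sigma = 1.0                                   # rms noise per sample (assumed)
model = model_signature(width=4.0, length=33)
data = [random.gauss(0.0, sigma) for _ in range(400)]
for k, m in enumerate(model):                 # inject a source with peak S/N = 3
    data[180 + k] += 3.0 * m

amps = matched_amplitudes(data, model)
best_lag = max(range(len(amps)), key=amps.__getitem__)
# Uncertainty of the fitted amplitude for white noise of rms sigma.
amp_sigma = sigma / math.sqrt(sum(m * m for m in model))
print(f"best lag {best_lag}, amplitude {amps[best_lag]:.2f} +/- {amp_sigma:.2f}")
```

Because the fit uses every sample in the pulse rather than the peak
sample alone, the amplitude uncertainty is smaller than the
per-sample noise, which is the advantage over reading a peak by eye.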

Many of the noise characteristics of survey data are also in
evidence in Figure 1. The noise is the key to source detection,
and a thorough understanding of its characteristics is
mandatory for efficient data analysis. Of course, the most
important elements of the noise character, its amplitude and
frequency spectra, are not easily comprehended from the 
illustration. Those subjects will be covered in later reports. 

Several important elements are evident, however. In a number 
of the traces, the noise amplitude is seen to vary. (This is 
especially evident if Figures 2 and 3 are also consulted.) 

This nonstationary amplitude variation implies a variable 
false alarm rate for fixed detection gates which complicates 
the task of creating a uniformly complete survey. In some 
portions of several traces, the signal is seen to go "flat."
In these periods the noise has fallen below the digitizing
step level and only the noise peaks appear. Such flat 
segments could lead to anomalously low rms values and further 
surges in the false source rates. Since it is wise to choose 
digitizing steps comparable to the noise level for best 
dynamic range and other considerations, the resulting contribu- 
tion to the noise character by the digitizing process must be 
thoroughly accounted for. 
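
The effect of the digitizing step on the measured rms can be
illustrated with a small Python sketch. It assumes zero-mean Gaussian
noise and a simple round-to-nearest digitizer; both are modelling
assumptions, and the step and noise values are arbitrary.

```python
import math
import random


def quantize(values, step):
    """Round each value to the nearest digitizing level."""
    return [step * round(v / step) for v in values]


def rms(values):
    """Sample rms deviation about the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


random.seed(7)
step = 1.0
for sigma in (1.0, 0.5, 0.2, 0.1):
    noise = [random.gauss(0.0, sigma) for _ in range(2000)]
    measured = rms(quantize(noise, step))
    print(f"true sigma {sigma:4.2f}   rms after digitizing {measured:5.3f}")
```

When the noise is well below the step size nearly every sample
rounds to the same level, the trace goes flat, and the computed rms
falls far below the true value. A detection gate set as a multiple of
that rms would then sit too low, which is the source of the surge in
false detections described above.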

A final caveat in the data task is illustrated by the signature 
in the topmost trace. Here, an apparently strong signal occurs 
in only one color of the group. This does not have the
character of one of the known types of false signals, yet some
anomalies can be noted. First, the signal form is not a true
match to a point source signature. Second, although it is very
large, no signal is seen in the other two spectral bands.
Since the signal is seen on the middle of the three spectral
bands, the object must have an extremely nonthermal spectrum if
it is real. Since the noise is also seen to be variable
immediately prior to the signal, there is much uncertainty to
be associated with it. Such attributes must be measured in a
comprehensive data processing system in order to provide later
stage software routines with enough information to make
consistent decisions.

Figures 2 and 3 demonstrate the character of extended source 
data typical of a point-source optimized scanning survey. Seen 
in Figure 2 in the top four groups of detectors is a compact 
HII region with a size somewhat less than 30 arc minutes 
diameter as indicated by the data. Note that the signature in 
the second detector group is very similar to the point source 
of Figure 1. Of course, the ratio of intensities is indicative 
of a low color temperature as could be expected for an HII 
region. It is clear that the signals from all 12 detectors are 
related to a single object which would pass most point source 
criteria. Obviously, care must be taken to note and measure the 
extended source attributes so that the signals are assigned to 
only one source and that that source is identified as a small 
extended object. 

Figure 2. Sample Data - Small Extended Object

In Figure 3 an example of a much more complex extended source is
shown. Keeping in mind that this sensor system is A.C. coupled
with strong low frequency de-emphasis, the source structure is
seen to cover nearly 6° in azimuth. That it is seen only
weakly below the third detector group (zenith measurement)
indicates that the source probably extends out of the field of 
view. Other scans at higher zenith angle may have further data 
on this source. Because the bandlimiting function of the 
electronics is well known, it is possible, in principle, to 
recover some of the low frequency information and reconstruct 
an intensity map of this object. The techniques for accomplishing 
this recovery and reconstruction are not well understood but 
are an element of this study. 

A philosophical question is raised by this source on data 
cataloging. If such complex sources are processed as intensity 
maps as an addendum to a point-source catalog, how should one 
treat the obvious hot spots in this object? We may be seeing 
stars imbedded in a large emission region - should these spots 
then be included in the point-source catalog as well as the 
maps? Or should the point-source signature be subtracted from the
map and only the extended emission shown? The answers to such
questions are of primary importance in the design of the software 
system, as will be discussed in a later section of this report. 

2.3 Classification of Infrared Source Data

The characteristics of the astronomical sources being surveyed
divide them into three distinct classes, as demonstrated in
the previous section. The three classes are point sources
(e.g., stars) with diffraction limited images; slightly
extended sources (e.g., compact HII regions) whose signatures
are point-like but not diffraction limited; and diffusely
extended sources (e.g., the Orion Nebula) with spatial 
structure extending several degrees. 

Point Sources for the purpose of the data analysis system are 
defined as IR detections with signatures characterized by the 
optical limit of the telescope system. Generally, this is 
diffraction limited with the actual image blur a function of
both the optical components and the spectral characteristics
of the filter system and the source. From a data analysis
viewpoint, these signals are individually the minimum
information content limit of the system. Typically, detector
size and basic frequency characteristics are set by the point
source response needs. As such, they place the smallest
bandwidth requirement on the signal transmission and processing
systems.

Slightly Extended Objects (SEO's) are not much different from 
point sources. These objects are not too much larger than the 
detector size, perhaps up to tens of arc minutes. As such, 
they can normally be handled by point-source processing if
some care is taken in measuring their extent. Since their
information content in the data stream does involve a slightly
wider bandwidth than true point sources, detection using
point-source optimized filters will underestimate their size.
Accommodating this extra information is a task of SEO processing.

For source sizes beyond a few tens of arc minutes, the 
information content of the source signal encompasses a 
significantly wider bandwidth than point sources, with the
increase toward lower frequencies. This increase must be
accommodated at all levels of telescope system and data
processing design. A goal of such processing might be to
produce a map of the region in the form of a photo-image or a
contour plot of isophote levels. Because of the distinctly
different end product, extended sources might best be processed
separately from the point-source system. The only overlap
would occur at first detection where the interleaved information
of point-like and extended sources is separated. As mentioned,
this involves both philosophical questions on how to handle the
data and technical ones on how to treat the wideband information.

3.0 DATA PROCESSING FOR POINT SOURCES

This section is concerned with the techniques to be used in 
detecting and processing point sources. First, the sequence
of actions to be implemented in going from raw, time sequence 
data to an organized final catalog is described. Then follows 
a discussion of the functional algorithms necessary in the 
sequence. The most basic function area is detection 
techniques where three common approaches are described and 
compared on the basis of gaussian statistics. Noise analysis 
logically precedes detection, including the technique for 
measuring noise values and the parameters contributing to its 
character. Another portion discusses the weighting functions 
used in various second-stage processing routines. Finally, the 
algorithms concerned with false sources are discussed. 

3.1 Sequencing of Point-Source Processing Routines 

A number of different measurements are derived from raw survey 
data. These values are used to discard false sources from the 
data base and to control the manner in which repeated observa- 
tions are weighted and combined. By separating the several 
decision gates into the proper sequence, the best throughput 
of data to the final catalog can be achieved. The controlling 
philosophy in designing this sequence is to make the most 
critical decisions first. With a goal of cataloging all real 
sources and no false sources, the first level of detection must 
be designed for maximum probability of detection, admitting a 
concurrent maximum in false alarm rate. Given then that all
detectable real sources are included in the detections list,
following decisions are sequenced so that at each stage the
largest possible number of false sources is discarded first
without affecting the real ones. Stated concisely, the
statistical confidence level of the data base as a whole should
increase by the largest possible amount following each decision. 

Figure 4 diagrams a sequence which analysis and experience with 
other survey data bases indicates closely approaches that ideal; 
each step is discussed briefly below. In most cases the gates
are simple tests on the magnitude of the confidence measure. 

Other gates are more complex combinations of criteria, such as 
identifications, background brightness in that direction, and 
channel performance. The first five gates could even vary as a 
function of time depending on the variations in sensor performance 
and background conditions. The last gate might be variable in 
order to maximize the real star content of the final catalog, but 
the scientific community generally prefers a catalog with some sort 
of statistical uniformity, which would mean a fixed gate perhaps at 
a brightness level corresponding to a 90% confidence of completeness.

The first gate, at step 5 of the sequence, does not discard a 
large number of the detected signals. However, since it is 
testing for specific false signals, it has negligible effect on 
the real ones. Tests performed here are for particle hits (or
other rapid-rise phenomena), telemetry dropouts, and dust
particles. Since each of these has a unique signature very
unlike a real source, they can be easily tested for. Of course,
the deleted sources need to be saved as a separate file for
reference later and for status monitoring.

Figure 4. Point-Source Processing Sequence

 1  Data Input
 2  Noise Analysis and Measurement
 3  Detection
 4  Measure characteristics of source, determine
    1st confidence measure (CM)
 5  Discard specific false sources (1st gate)
 6  Gate on raw statistics (2nd gate)
 7  Combine signals based on focal plane
    characteristics, re-do CM
 8  Gate on FPA anomalies (3rd gate)
 9  Combine multiple scans, re-do CM
10  Gate scan anomalies, e.g., moving objects
    (4th gate)
11  Determine observation record for each source
12  Gate total observation quality of each source
    (5th gate)
13  Determine positional associations
14  Gate to desired catalog statistics (6th gate)
15  OUTPUT

The second gate, occurring at step 6, is the first statistical 
testing of source quality. The detection and measurement steps 
determined a number of values which are somewhat independent
measures of the source signature. For large signals, any one of 
these would be sufficient to qualify a real measurement; weaker 
sources pose a more difficult challenge. The values of 
correlation, S/N, amplitude, and duration are tested to accept 
or reject a detection. This step should trap a significant 
fraction of the false detections which pass through a gaussian 
3 sigma test statistic. The gate level here will probably show
the widest variations with time due to nonstationary noise
effects.
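
The sensitivity of a fixed gate to nonstationary noise can be shown
with a short Python sketch. It assumes zero-mean Gaussian noise and a
one-sided threshold test on a single sample; the noise levels are
illustrative only.

```python
import math


def false_alarm_probability(threshold, sigma):
    """One-sided Gaussian tail probability P(x > threshold), zero-mean noise."""
    return 0.5 * math.erfc(threshold / (sigma * math.sqrt(2.0)))


nominal_sigma = 1.0
gate = 3.0 * nominal_sigma        # gate fixed at 3 sigma of the nominal noise
for actual_sigma in (0.8, 1.0, 1.2, 1.5):
    p = false_alarm_probability(gate, actual_sigma)
    print(f"actual sigma {actual_sigma:3.1f}   false alarms per sample {p:.2e}")
```

A modest rise in the actual noise multiplies the false alarm rate
many times over when the gate is held fixed, which is why the gate
level would need to follow the measured local noise.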

The next step combines potential multiple detector signals of 
some source. The associated gate will delete signals 
attributable to crosstalk between channels, referring to the
lists of false sources for time coincidence testing. Other 
focal plane effects will also be removed here as they are 
identified from detailed knowledge of the sensor system. 

If additional observations are made in a given area, a very 
strong gate can be created favoring real sources. With a 
detailed knowledge of the sensor performance and noise history,
a very high confidence can be attached to sources seen
repeatedly. The algorithm for this weighting is discussed in
Section 3.4. Sources rejected by this test may be moving 
objects such as artificial satellites, planets, or asteroids, 
and could be subjected to further analysis outside the point- 
source flow. 

The remaining steps in the illustrated sequence serve to
organize the final data base and catalogs. Decisions and
gates here intend to qualify the catalog to some external
standards.
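
The ordering of decision gates, with rejected detections retained
for later reference, can be sketched in Python as below. Only the
first two gates are shown. The record fields, flag names, and
threshold values are hypothetical placeholders chosen for
illustration, not values proposed by this report.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    snr: float                                  # peak signal-to-rms-noise
    correlation: float                          # match to the model signature
    flags: set = field(default_factory=set)     # e.g. {"particle_hit"}


def run_gates(detections, gates):
    """Apply gates in order; return survivors and the rejects of each gate."""
    rejected = {name: [] for name, _ in gates}
    survivors = list(detections)
    for name, passes in gates:
        kept = []
        for d in survivors:
            (kept if passes(d) else rejected[name]).append(d)
        survivors = kept
    return survivors, rejected


# Hypothetical labels for the specific false signals tested at the 1st gate.
SPECIFIC_FALSE = {"particle_hit", "telemetry_dropout", "dust_particle"}

gates = [
    ("gate1_specific_false", lambda d: not (d.flags & SPECIFIC_FALSE)),
    # Placeholder thresholds; real gate levels would track the noise history.
    ("gate2_raw_statistics", lambda d: d.snr >= 3.0 and d.correlation >= 0.6),
]

detections = [
    Detection(snr=8.0, correlation=0.9),
    Detection(snr=9.0, correlation=0.9, flags={"particle_hit"}),
    Detection(snr=2.0, correlation=0.3),
]
catalog, rejects = run_gates(detections, gates)
print(len(catalog), {name: len(items) for name, items in rejects.items()})
```

Keeping the rejects keyed by gate name corresponds to saving the
deleted sources as a separate file, and it also gives the later
crosstalk gate a list of false sources to test against.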

3.2 Noise Analysis

The entire data processing scheme is strongly controlled by the 
noise characteristics. Specifically, noise analysis is needed 
in two parts of point-source analysis. First, a local "true" 
rms value of the noise is used as a detection criterion in 
several possible detection tests. This measurement is somewhat 
circular since the true noise evaluation must exclude sources, 
but the sources can't be excluded until they have passed a 
signal vs. noise detection. Second, a simple rms value does 
not fully characterize non-gaussian or nonstationary noise. 
Separate analysis is useful outside the processing flow to 
understand the amplitude-frequency spectra of the noise, the
effects of baseline offsets, the influence of digitization on 
the noise, and the success of source removal for noise 
calculations. This detailed analysis should be monitored for 
its influence on the multiple decision gates of the processing
sequence. 


A comment is appropriate on the origins of noise and signal in
the data stream. Consider the signal (before the bandlimiting
filter and digitizer) as a rectangular pulse of duration τ. Its
power spectral density is given by

$$ S(\omega) = \left( \frac{\sin(\omega\tau/2)}{\omega\tau/2} \right)^{2} \tau \qquad (3.2\text{-}1) $$

which is illustrated below:

[Figure: S(ω) plotted against ωτ/2]

The largest part of the signal's power is near zero frequency.
If the noise is white with sharp bandlimits larger than ωτ/2 = 3,
the signal power-to-noise power ratio decreases as the frequency
increases. Then the overall S/N ratio (which is the integral
over frequency of the signal PSD ÷ noise PSD) can be improved
by removing the higher frequencies. Then the signal power will
be reduced only slightly while the noise is reduced more
severely. If this "optimizing" filter follows the sin²x/x² or
the 1/x² envelope of the signal's PSD, then the S/N improvement
will be optimal. However, this process tends to make the noise
more like the signal so that the cross correlation coefficient 
of this filtered signal with a model signature becomes large 
even if no signal is present. This problem is further exacerbated
when an A.C. coupled transfer function is used in the sensor for
stability reasons. Then the strongest portion of the signal,
near zero frequency, is de-emphasized. The low frequency noise
is also reduced, but the general effect is to remove a greater
portion of the signal power than the noise power. Thus, it is
desirable to use the best possible low frequency performance in
the sensor system even for the detection of point sources! The
problem for extended sources is even worse since when the dwell
time τ becomes very large, the first zero in the PSD falls at a
very low frequency so that very little of the signal's power 
exists at the cutoff frequency of the point-source signature. 
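
The distribution of pulse power over frequency can be checked
numerically with the Python sketch below. It integrates the
(sin x / x)² shape in the variable x = ωτ/2 using a simple trapezoid
rule, and it treats both the high frequency limit and the A.C.
coupling as ideal sharp cutoffs. The cutoff value x = 0.3 is an
arbitrary illustration, not a property of any actual sensor.

```python
import math


def sinc2(x):
    """(sin x / x)^2, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2


def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h


def power_fraction(x_low, x_high):
    """Fraction of total pulse power with x_low <= |x| <= x_high."""
    # The integral of (sin x / x)^2 over the whole real line equals pi.
    return 2.0 * integrate(sinc2, x_low, x_high) / math.pi


main_lobe = power_fraction(0.0, math.pi)   # up to the first zero of the PSD
lost_to_ac = power_fraction(0.0, 0.3)      # removed by an assumed sharp A.C. cutoff
print(f"power below first zero: {main_lobe:.3f}")
print(f"power removed below x = 0.3: {lost_to_ac:.3f}")
```

About 90 percent of the pulse power lies below the first zero, so
discarding higher frequencies costs little signal. By contrast, even
a narrow notch at zero frequency removes a noticeable share of the
signal while removing only a small share of white noise, which is
the penalty of A.C. coupling noted above.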

One should note that the PSD of a rectangular pulse is the same 
whether the pulse was produced by a detector scanning rapidly over
a point source or by a chopped or beam-switched sensor scanning
slowly over the star. In the latter case, it is possible to
produce several rectangular pulses for a single star (if the
chopping period is shorter than the star's dwell time). These
multiple signatures can be processed independently or co-added to
increase the confidence measure of the detection. However, since
such a chopped system is looking at the source only 1/2 of its
time, at least two cycles are needed to achieve the same CM as the
scanning system. A further difficulty with a beam-switched sensor
is the confusion in the system caused by the presence of a different
star in each beam. 


In implementing these concepts the realization described below 
is taken for noise calculation. The noise value itself is used 
only as a numerical value and is determined only as a voltage. 

The digitizer number count cannot, in general, be used because 
the digitizer input is not linearly related to the detector 
output; rather, some logarithmic compression is typically used in 
the intermediate amplifiers. All processing then should occur 
after this compression is inverted. 

The noise value itself is calculated using a straightforward rms 
summation: 

$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} X_i^2 - \frac{1}{N(N-1)} \left( \sum_{i=1}^{N} X_i \right)^2$$

It can be shown that a constant value offset over the N samples 
has no effect on the square deviation. 
However, any organized change in the mean value of $X_i$ over 
the N samples easily becomes the dominant element of $\sigma^2$. That 
is, if the mean value drifts linearly over N samples, so that: 

                                                            3.2-4 

where the first term is the true MSD of the data. Obviously, the 
constant drift heavily weights the noise value through the second 
term in the above equation. Low frequency baseline drifting of the 
data, a phenomenon known to occur in IR detector systems, has 
much the same effect as does the presence of a real source 
signature. 
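
The size of the drift contribution can be illustrated with a short 
worked derivation. This is a sketch in assumed notation and is not 
necessarily the expression printed as 3.2-4: $x_i$ denotes the 
drift-free samples, $b$ an assumed constant drift per sample, and 
$\sigma_0^2$ the MSD of the $x_i$ alone. 

```latex
X_i = x_i + b\,i, \qquad i = 1, \dots, N

\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2
         = \sigma_0^2
           + \frac{b^2 N (N+1)}{12}
           + \frac{2b}{N-1} \sum_{i=1}^{N} \left( i - \frac{N+1}{2} \right) (x_i - \bar{x})
```

Under these assumptions the drift term grows roughly as $N^2$, while 
the cross term averages to zero for noise that is uncorrelated with 
the drift, so long blocks are dominated by the drift rather than by 
the noise. 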

To compensate for real sources present, the noise value is 
calculated using continuous blocks of data without stars. This 
seems somewhat circular, but in practice the blocks containing 
stars have MSD values much larger than empty noise blocks. By 
monitoring the noise level over several blocks the star 
signatures are easily discarded in determining the local average 
MSD. Alternatively, a low-pass digital filter can be applied 
to the sequence of MSD values which cuts off this rapid 
fluctuation in the noise due to source presence. For example, 

$$\langle \mathrm{MSD} \rangle_i = K \, \mathrm{MSD}_i + (1-K) \, \langle \mathrm{MSD} \rangle_{i-1} \tag{3.2-5}$$

where the brackets <> indicate the filtered, or weighted, 
average value of the noise. The value of K is chosen to 
provide the appropriate frequency cutoff in the spectrum of 
MSD values. 
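
The block MSD and the recursive smoothing of 3.2-5 can be sketched 
as follows. This is a minimal illustration, not the survey 
software: the function names, the seeding of the filter with the 
first block, the value K = 0.1, and the sample MSD sequence are all 
assumptions made for the example. 

```python
def block_msd(samples):
    """Sample variance (MSD) of one block from the sums of X and X^2."""
    n = len(samples)
    s1 = sum(samples)
    s2 = sum(x * x for x in samples)
    return s2 / (n - 1) - s1 * s1 / (n * (n - 1))


def smoothed_msd(msd_values, k):
    """Recursive low-pass filter: <MSD>_i = K*MSD_i + (1-K)*<MSD>_{i-1}."""
    smoothed = []
    previous = msd_values[0]  # assumption: seed the filter with the first block
    for msd in msd_values:
        previous = k * msd + (1.0 - k) * previous
        smoothed.append(previous)
    return smoothed


# Hypothetical block MSD values; the fourth block contains a star.
msd_sequence = [1.0, 1.1, 0.9, 25.0, 1.0, 1.05]
filtered = smoothed_msd(msd_sequence, k=0.1)
```

With a small K the single high block raises the running noise 
estimate only modestly, which is the behavior wanted when a star 
passes through one block. 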

The effect of low frequency baseline drifting on the MSD 
calculation is more difficult to compensate. One approach is 
to reduce the number of values in a block (the N value) so 
that the second term in 3.2-4 is acceptably small. Since this 
also reduces the confidence of the determination, an 
optimum value of N exists which balances the accuracy against 
the error. Another approach is to apply a high-pass digital 
filter to the data to remove the low frequency drifting. This 
may be an optimal filter for point-source discrimination, and 
hence, an efficient approach to the noise calculation. 

However, because of the contribution of digitization noise and 
the possible effects of a nonstationary noise variance, the 
optimally filtered noise is not uniformly related to the raw 
noise. This tends to complicate the statistical control of the 
noise evaluation, offsetting the efficiency of the optimal 
filter approach for noise calculation. The third possible 
solution to the drifting baseline is to use a best fit 
determination to subtract the baseline. By choosing a sufficiently 
small N, a first order orthogonal fit to the block's data can 
adequately remove the effects of the second and third terms in 
3.2-4. 
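
The best fit approach can be sketched as an ordinary least-squares 
straight line removed from each block before the MSD is formed. 
The function name `detrended_msd` and the choice of N - 2 degrees 
of freedom are assumptions of this example rather than details 
taken from the survey programs. 

```python
def detrended_msd(samples):
    """MSD of a block after subtracting a least-squares straight line."""
    n = len(samples)
    t_mean = (n - 1) / 2.0
    x_mean = sum(samples) / n
    # Slope of the best-fit line through the points (i, samples[i]).
    s_tt = sum((i - t_mean) ** 2 for i in range(n))
    s_tx = sum((i - t_mean) * (x - x_mean) for i, x in enumerate(samples))
    slope = s_tx / s_tt
    residuals = [x - x_mean - slope * (i - t_mean)
                 for i, x in enumerate(samples)]
    # Two parameters (mean and slope) were fitted, hence n - 2.
    return sum(r * r for r in residuals) / (n - 2)


# A pure linear drift with no noise leaves essentially nothing behind.
ramp = [0.5 * i + 3.0 for i in range(16)]
residual_msd = detrended_msd(ramp)
```

Centering the sample index on its mean makes the constant and 
linear terms orthogonal, so the two coefficients can be found 
independently. 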

A key problem of continuous noise measurement is the point in 
the processing sequence where the MSD is calculated. As will 
be discussed in Section 3.3, the raw data stream can be 
transformed into several possible domains. Whether the noise is 
best determined using the raw data, the optimally filtered data, 
or the correlated data must be determined from a thorough 
understanding of the actual instrument performance. The gate level 
for source detection is then related to the calculated MSD 
value at a level which corresponds to the desired error rate for 
gaussian statistics. This is operationally adjusted by 
monitoring a detailed analysis of the noise parameters. 
(Self-adaptive detection schemes skirt the noise analysis issue 
under certain types of non-ideal noise by continuously compensating 
the detection algorithm for the noise character.) Most 
detection schemes assume that the noise is stationary, additive, 
white, bandlimited and gaussian; more deeply embedded is the 
implicit assumption that the noise process is random and 
ergodic. Real noise rarely achieves this ideal state, and 
accurate control of the performance of the detection scheme 
requires a knowledge of the deviations from the above standards. 
The monitoring of this status is the second major function of 
the noise analysis requirements. To understand these deviations, 
we begin with a description of the ideal noise. The deviations 
from this standard will be adapted in Section 3.3 to control the 
detection techniques. 

The data stream is assumed to consist of a pure signal and an 
additive random noise: 

r(t) = s(t) + n(t)                                          3.2-6 

Obviously, we have lumped all elements of r(t) that are not 
the signal s(t) which we desire to detect into the noise n(t). 

If part of this noise is a non-random function, the performance 
of our detection scheme will be degraded. First, the extraneous 
function will make a significant contribution to the magnitude 
of the noise variance, as demonstrated above. Second, and perhaps 
more significant, the power spectrum of the lumped noise will 
contain a strongly correlated function, skewing the statistical 
error rate. Further discussions will assume that this function 
has been subtracted from the received signal to generate r(t). 

Further, we assume that there are no multiplicative noise terms 
$f[s(t)]\,n_1(t)$ in 3.2-6. In real infrared sensors, and in 
photon limited detectors in general, there is always a noise 
increase in the presence of a signal because of the statistical 
fluctuations in the photon quanta, which are proportional to 
the root of the photon density. Hence, the noise rises from 
$n(t)$ to $n(t) + k\sqrt{s(t)}$ when a source is present. In practice, 
however, the second noise term increases the uncertainty of the 
amplitude determination, not the error rate of the detection. It 
is not feasible to actually measure the noise in the presence 
of a signal by subtracting the detected signal because the 
sampling rates used are not high enough to completely determine 
the signal (100% certainty). The noise is calculated where 
signals are not detected, and 3.2-6 is assumed to hold so that 
this value can be transferred to the detection period. 

The random noise is said to be stationary if its probability 
density function is invariant to a shift of the time origin. 
Then: 

$$p(x_t, x_{t-\tau}) = p(x_{t+t'}, x_{t+t'-\tau}) \tag{3.2-7}$$

The autocorrelation function $R_n(t, t-\tau)$ of the noise is defined 
as: 

$$R_n(t, t-\tau) = \iint x_t \, x^*_{t-\tau} \, p(x_t, x_{t-\tau}) \, dx_t \, dx_{t-\tau} \tag{3.2-8}$$

If 3.2-7 holds, then $R_n(t, t-\tau) = R_n(\tau)$; however, even if 
3.2-7 is not true, when the noise has a time invariant mean and 
$R_n(t, t-\tau) = R_n(\tau)$, then the noise is wide-sense stationary, 
which is sufficient for all detection techniques requiring 
stationary noise. Further, when the process is ergodic, then 
the ensemble average given by 3.2-8 is equal to the time 
average autocorrelation function: 

$$R_n(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x(t) \, x^*(t-\tau) \, dt \tag{3.2-9}$$

where the asterisk denotes complex conjugation. 


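The noise is white when its power spectral density $S_n(\omega)$ is a 
constant ($N_0/2$) over the entire frequency range, and the 
autocorrelation function is a delta function $(N_0/2)\,\delta(\tau)$. 
Here, the power spectral density is the Fourier transform of the 
autocorrelation function: 

$$S_n(\omega) = \int_{-\infty}^{\infty} R_n(\tau) \, e^{-j\omega\tau} \, d\tau \tag{3.2-10}$$

When the noise is bandlimited to a range $|\omega| < \Omega$, then 

$$S_n(\omega) = N_0/2, \quad |\omega| < \Omega; \qquad R_n(\tau) = \frac{N_0 \Omega}{2\pi} \, \frac{\sin \Omega\tau}{\Omega\tau} \tag{3.2-11}$$

Finally, the gaussian properties of the noise are defined by 
the probability density function: 

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma^2} \right] \tag{3.2-12}$$

where $\bar{x}$ is the mean and $\sigma^2$ the variance of the noise. 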


For a digital processing system, there is a contribution to the 
uncertainty of the signal due solely to the quantization of the 
signal into discrete steps. The probability density function for 
this error is uniform over the quantization interval, so that the 
maximum possible error is one step, $E_0$. Then for either rounding 
or truncation, the MSD of the quantization noise is: 

$$\mathrm{MSD} = \frac{E_0^2}{12} \sum_{m=0}^{n} h^2(mT) \tag{3.2-13}$$

where h(mT) is the time-domain expression of the transfer function. 
The summation term in 3.2-13 is important when the transfer 
function is a logarithmic compression where the quantization noise 
grows with the signal compression. 
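
The quantization noise level can be checked numerically. For an 
error uniformly distributed over one step the variance is the 
square of the step divided by 12; the sketch below measures this 
for a rounding quantizer with no filtering applied (a unit impulse 
response). The step size, the input range, and the function name 
are assumptions chosen only for the example. 

```python
import random


def quantization_msd(step, n_samples=100_000, seed=1):
    """Empirical MSD of the rounding error for a uniform quantizer."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.uniform(-100.0, 100.0)   # hypothetical analog input
        q = step * round(x / step)       # round to the nearest step
        total += (q - x) ** 2
    return total / n_samples


step = 0.5
measured = quantization_msd(step)
predicted = step ** 2 / 12.0
```

The measured and predicted values agree to within the statistical 
scatter of the simulation. 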


3.3 Detection Techniques 

The simplest method of point-source detection is the visual 
analysis of strip chart plots of the detector outputs. The 
eye is a remarkable analog processor, but it is rarely possible 
to "see" signals which have a peak signal-to-rms noise value below 
5 or 6 because the eye cannot do the rms process very well. For 
the large data rate of sky surveys, this is an unfeasible approach 
in any case. The only benefit of eyeball analysis is that with 
such a low sensitivity, the error rate is very small. A number of 
digital algorithms have been used in IR surveys which are 
described below. Performance analysis for each is beyond the scope 
of this report but is covered in depth in Whalen [7] or Gerlach [8]. 
Generally, for ideal noise the stored replica correlator or 
matched filter is optimum; for nonstationary but otherwise ideal 
noise, adaptive detection techniques such as phase-coherence and 
wave period correlations achieve lower overall error rates. 
However, the latter are difficult to realize and costly in 
processing time. Since the signals are wide-sense stationary over 
fairly long periods T, the adaptive techniques are discussed 
briefly below only for completeness. The complications of 
amplitude determination given a detection are mentioned, but error 
compensation is generally relegated to the weighting functions of 
later stages of processing as discussed in Section 3.4. The 
detection processor will be most efficient by concentrating on 
detection and then making an estimate of the detected amplitude, 
leaving to the multiple measurement routines the task of 
statistically controlling the amplitude accuracy. 

The simplest analysis test is the peak signal detector. A detection 
threshold of peak signal-to-rms noise is selected (e.g., 3σ), and 
any sample exceeding that level is selected as a signal. The 
following samples are searched for a maximum until the sample 
value again falls below the threshold level before another 
detection search is initiated. It is possible by this technique 
to choose signals only one sample long so that even with gaussian 
noise statistics the false detection rate is high. (At 3σ, there 
will be 1.3 samples above the threshold in every 1000; at 10σ, 
there are only 7.5 in 10^24 false pulses, but the system 
sensitivity has been severely degraded for real sources, too.) 
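
The peak detector and its gaussian false detection rate can be 
sketched as below. The function names and the sample data are 
illustrative assumptions; the exceedance probability is the 
standard one-sided gaussian tail. 

```python
import math


def gaussian_exceedance(k):
    """One-sided probability that a gaussian sample exceeds k sigma."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def peak_detect(samples, rms_noise, k=3.0):
    """Return (index, value) of the maximum in each run above k*rms_noise."""
    threshold = k * rms_noise
    detections = []
    peak = None
    for i, x in enumerate(samples):
        if x > threshold:
            if peak is None or x > peak[1]:
                peak = (i, x)
        elif peak is not None:
            detections.append(peak)   # run ended: record its maximum
            peak = None
    if peak is not None:
        detections.append(peak)
    return detections


data = [0.2, -0.5, 3.4, 5.1, 4.0, 0.3, -0.1, 3.2, 0.0]
hits = peak_detect(data, rms_noise=1.0, k=3.0)
```

Note that the second detection in this example is a single sample 
above threshold, the case that drives the false detection rate. 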

In general, the IR sensor system is chosen so that both point 
sources and larger sources are detectable. Then the bandwidth 
is larger than necessary for the point-source signature, admitting 
a larger portion of noise power than signal. To rectify this, it 
is common to digitally bandlimit the data stream to the minimum 
for point sources. This is done with a recursion filter of the 
form: 

$$y_i = \sum_{k=1}^{m} h_k \, y_{i-k} + \sum_{l=0}^{N} h'_l \, x_{i-l} \tag{3.3-1}$$

where $x_i$ are the input samples, $y_i$ are the filtered samples, 
and the coefficients $h_k$ and $h'_l$ are determined from the desired 
frequency response, as described in Gold and Rader [9, 10], and 
elsewhere. In digital filters, we are not limited to causal 
(real-time) filters since all future and past samples are 
available; we simply replace the l.h.s. of 3.3-1 with $y_{i-j}$ and 
set some $h_k$'s to zero and we have a "future" looking filter. 
(Essentially, these filters begin to respond before a signal 
appears. The reasons for using them are complex, but basically 
they make the realization of the desired transfer function into 
digital form simpler.) 
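
A recursion filter of this kind, with feedback on past outputs and 
a weighted sum of present and past inputs, can be sketched 
directly. The function name and the single-pole coefficients are 
assumptions for illustration; real coefficients would come from 
the desired frequency response. 

```python
def recursive_filter(x, feedback, feedforward):
    """y[i] = sum_k feedback[k-1]*y[i-k] + sum_l feedforward[l]*x[i-l]."""
    y = []
    for i in range(len(x)):
        acc = 0.0
        for l, b in enumerate(feedforward):          # l = 0 .. N
            if i - l >= 0:
                acc += b * x[i - l]
        for k, a in enumerate(feedback, start=1):    # k = 1 .. m
            if i - k >= 0:
                acc += a * y[i - k]
        y.append(acc)
    return y


# Hypothetical single-pole low-pass: y[i] = 0.8*y[i-1] + 0.2*x[i]
step_response = recursive_filter([1.0] * 50, feedback=[0.8], feedforward=[0.2])
```

Samples before the start of the record are treated as zero here, 
which is one of several possible start-up conventions. 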


The use of such a filter becomes optimum when the transfer function 
is chosen with the complete knowledge of the signature of a real 
source. In this case, the filter is the time inverse of the 
expected signal, hence the filtered data is an optimal matched 
correlation output. In fact, there is a slight difference between 
a true correlator and a matched filter, but the digital realization 
is identical. For an ideal matched filter, the filter function 
$h_0(t)$ is the solution to: 

$$\int_0^T h_0(z) \, R_n(t-z) \, dz = s(T-t) \tag{3.3-2}$$

where T is the period of the expected signal s(t), and $R_n(t)$ is 
the autocorrelation function of the noise. The presence of $R_n(t)$ 
has the same effect as a pre-whitening filter when the noise is 
colored. Further, no assumption of gaussian noise character was 
made in the derivation of 3.3-2 so that the matched filter will 
be an optimum detector if $h_0(t)$ satisfies the relation for all 
time, and if the correlation function of the noise is known. 


It is important to note that 3.3-2 is a Fredholm equation of 
the first kind, and exact solutions are obtainable only for a 
limited class of autocorrelation functions. In the case 
of nonstationary noise, $R_n(t)$ is determined from the locally 
stationary noise record, and 3.3-2 is solved for the optimal 
filter. This is the simplest form of adaptive detector in 
digital processing and results in a complicated software 
package which is very slow in execution. However, such an 
approach might be implemented piece-wise when some simple 
monitor calculation signals a significant change in $R_n(t)$. 

Given a properly matched filter $h_0(t)$, the data is transformed 
via 3.3-1, and a threshold crossing detection is performed on 
the output. As in the simple peak detection approach, the 
maximum sample is selected to locate the time of the signal. 

For white noise, the correlator output, y(t), is: 

$$y(t) = \int_0^T r(t) \, s(t) \, dt \tag{3.3-3}$$

which is a Bayes-best estimate of the unknown amplitude A, since 

$$\hat{A} = \frac{\int_0^T r(t) \, s(t) \, dt}{\int_0^T s^2(t) \, dt} \tag{3.3-4}$$

and it is assumed that the reference signal s(t) is normalized so 
that the denominator of 3.3-4 is unity. By extension, the filter 
output 3.3-1 is the best estimate of A with the weighting function 
$R_n(t)$ accounted for. Note that for white noise, the solution of 
3.3-2 as used in 3.3-1 makes 3.3-1 equivalent to 3.3-3. 
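
For sampled data the white-noise correlator reduces to a sliding 
sum of products between the data and a stored replica. The sketch 
below is a minimal discrete analogue of the amplitude estimate, 
dividing the correlation by the energy of the replica; the 
triangular template and the function names are assumptions for the 
example only. 

```python
def correlate_amplitude(received, template):
    """Discrete amplitude estimate: sum(r*s) / sum(s*s) over one template."""
    num = sum(r * s for r, s in zip(received, template))
    den = sum(s * s for s in template)
    return num / den


def correlator_output(data, template):
    """Slide the template along the data; one amplitude estimate per offset."""
    n = len(template)
    return [correlate_amplitude(data[i:i + n], template)
            for i in range(len(data) - n + 1)]


template = [0.0, 0.5, 1.0, 0.5, 0.0]   # hypothetical point-source signature
data = [0.0] * 4 + [2.0 * s for s in template] + [0.0] * 4
output = correlator_output(data, template)
best = max(range(len(output)), key=output.__getitem__)
```

The offset of the maximum locates the source in time, and the value 
at that offset is the amplitude estimate (2.0 for this noiseless 
input). 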

All of the preceding approaches are variations on the peak 
detection technique with various forms of signal conditioning 
occurring before detection. A second class of detection techniques 
ignores the peak signal and concentrates on the zero crossings. 
Since this approach discards amplitude information in favor of 
signal period and phase detection, it is possible to make 
statistically independent tests for amplitude (by peak detection 
techniques) and existence (by wave-period detection). This 
approach will give the maximum detection probability since all of 
the knowledge of the signal is being used. However, in IR survey 
processing the wave-period approach heavily discriminates against 
even slightly extended sources. Essentially, the zero crossing 
detectors make assumptions on the source characteristics rather 
than on the detector response characteristics, and thus are not 
well suited to the goal of an unbiased survey in any sense. 

However, when a specific class of IR objects is to be searched for, 
the wave-period processor may be an ideal approach since it 
intrinsically is insensitive to nonstationary noise. This is 
because a zero crossing detection scheme relies only on the 
frequency probability distribution, not the amplitude 
variations. Since a bandlimited system strongly controls the 
frequency spectrum, the temporal variations in noise amplitude 
are relatively unimportant. The wave-period technique will be 
especially fruitful in multicolor surveys when searching for 
specific color ratios and wavelengths (e.g., cool extended 
regions, or hot compact clouds). Such goals are outside the 
general sky survey, however, so the reader is referred to 
Gerlach [8] for detailed discussion of the wave-period algorithms. 

In the interest of computing reduction, a number of approaches 
to source detection have been tried which involve much less 
computation than the optimal filter or correlation approach. 
However, it was found that more fruitful results were achieved 
by procedural modifications of the correlation technique than 
by simplistic algorithms. For example, in computing noise, the 
square variance is determined rather than the rms value since 
the SQRT function is very slow in execution. Of course, the 
detection algorithm must be modified to suit the use of the 
MSD value, but the increased computation here did not exceed 
the savings in eliminating all SQRT functions. 
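
One way to adapt a threshold test to the MSD value is to compare 
squares, as in the sketch below. The function name and the numbers 
are assumptions; the point is only that the comparison against K 
times the rms noise can be made without a square root. 

```python
def exceeds_threshold(sample, msd, k):
    """True when sample > k*sqrt(msd), evaluated without a square root."""
    # Squaring loses the sign, so negative samples are excluded first.
    return sample > 0.0 and sample * sample > k * k * msd


# With MSD = 4.0 the rms noise is 2.0, so a 3-sigma gate sits at 6.0.
checks = [exceeds_threshold(s, msd=4.0, k=3.0) for s in (5.9, 6.1, -7.0)]
```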

Another technique successfully tested was the reduction of the 
sample size of the model function. This saving could be 
achieved because in one program the sensor sample rate was 
nearly four times the Nyquist limit (defined as twice the upper 
frequency limit of the information). A number of averaging and 
decimation techniques were tested, all of which performed about 
the same as the full size correlation. This was expected since 
little further information is added by the excess samples, and 
also because the limiting noise on some parts of the data was 
the quantization error. In fact, a three sample slope 
predictor smoothing function actually had a lower error rate 
because of a reduction in the noise variance. This was followed 
by a correlation detector (optimal filter) matched to the 
smoothed data stream and was very similar to the complete 
matched filter in overall complexity. However, if detection 
methods less complex (and less accurate) than optimal correlation 
are desirable for other considerations, the decimation-detection 
approach can save computing time nearly proportional to the 
decimation level. That is, 3X decimation takes 1/3 the processing 
time. The actual error performance of this concept has yet to 
be examined. Further, decimation increases the amplitude 
uncertainty because the number of samples in 3.3-4 drops. 

3.4 Weighting Functions 

Weighting functions are used in secondary processing stages to 
combine the values measured in the detection stage to produce 
an estimate of the true value of the source amplitude given 
several measurements. They are also used to create a unitary 
measure of the signal confidence given multiple detections. In 
a sense, they are also used to determine the existence of 
multiple measurements in that the positional matching of 
independent scans is implemented exactly as a unitary 
weighting function would be. 

Amplitude weighting is the most important task since the 
detection schemes generally ignore photometric accuracy 
requirements. This results in a wide scatter in the single scan 
calibration curves. Since the system noise is nonstationary and 
since the detector response is typically variable over long 
periods, the best calibration methods involve fitting standard 
star brightness to measured voltages in a least square sense for 
each stationary segment of a scan. Thus, secondary uncertainties 
in the brightness of new stars are introduced by the calibration 
process. Among the possible methods for determining the best 
amplitude estimate are simple averages or weighted averages. 

The simple average amplitude estimate is 

$$\bar{A} = \frac{1}{N} \sum_{i=1}^{N} A_i \tag{3.4-1}$$

where the $A_i$ are the N individual measures. When N is small 
and the calibration is a single measurement on each of several 
standard sources, this is the best amplitude measure. However, 
when the calibration of a particular detector against a 
particular star is repeated several times, then knowledge is 
obtained on the characteristic probability distribution function, 
$P_j(A)$, for each detector j. Then the best amplitude estimate is 
the average of the $A_i$ weighted by $P_j(A_i)$. 


However, it is uncommon for the unknown star to be surveyed 
repeatedly by the same detector. Then more complex information 
is needed on the probability distribution over all detectors, 
and 

$$\bar{A} = \sum_{i=1}^{N} \sum_{j=1}^{M} A_{ij} \, P_j(A_i) \tag{3.4-3}$$


The establishment of the complete probability density function 
is, of course, a major responsibility of the survey calibration. 
The above relations 3.4-1 to 3.4-3 can be further complicated by 
the inclusion of the known information on the certainty of each 
measurement. Since, in general, this S/N value is available for 
every detection, it too can be included as a weighting factor 
in 3.4-3 so that 

$$\langle A \rangle' = \frac{1}{\sum_{i=1}^{N} (S/N)_i} \sum_{i=1}^{N} \sum_{j=1}^{M} A_{ij} \, (S/N)_i \, P_j(A_i) \tag{3.4-4}$$

where the first term is the normalization factor for the (S/N) 
weights. 

The relations 3.4-1 to 3.4-4 serve to decrease the uncertainty 
of an amplitude measure in a statistical sense by making a best 
estimate average. If p(A) is gaussian or nearly so, then the 
multiplicity of measurements gives a photometric accuracy 
improvement over the single measurement uncertainty of a factor 
of $\sqrt{N}$ for the average amplitude. Thus, a 10% photometric 
accuracy can be achieved by 4 measurements of 20% error or 
25 measurements of 50% uncertainty. 
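
The averaging relations can be illustrated numerically. The sketch 
below shows a simple average, an average weighted by the S/N of 
each detection, and the square-root-of-N reduction of the 
uncertainty for independent, equally uncertain measures. The 
detector probability weights are taken as equal and omitted, and 
all names and values are assumptions made for the example. 

```python
import math


def simple_average(amplitudes):
    """Unweighted mean of the individual amplitude measures."""
    return sum(amplitudes) / len(amplitudes)


def snr_weighted_average(amplitudes, snrs):
    """Mean of the amplitudes weighted by S/N, normalized by the sum of S/N."""
    return sum(a * w for a, w in zip(amplitudes, snrs)) / sum(snrs)


def averaged_error(single_error, n):
    """Uncertainty of the mean of n independent, equally uncertain measures."""
    return single_error / math.sqrt(n)


amps = [10.2, 9.6, 10.9, 9.9]   # hypothetical amplitude measures
snrs = [8.0, 4.0, 3.0, 10.0]    # hypothetical S/N of each detection
plain = simple_average(amps)
weighted = snr_weighted_average(amps, snrs)
```

Here the weighted estimate is pulled toward the two measures with 
the highest S/N. 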

The positional weighting problem occurs because of a non-uniform 
spatial distribution in source location. The primary cause of 
the non-uniformity is the typical use of double-staggered 
arrays of detectors so that a portion of the sky is measured by 
two detectors with adjacent portions covered by only one. 
Typically, the singly covered strip is twice as large as the 
doubly covered one, and it is difficult to locate additional 
scans with sufficient accuracy to place a second or third scan 
in the singly covered region, thereby recovering uniformity of 
coverage. The probability distribution across the overlapped 
and non-overlapped portions of the detector is the product of 
a uniform distribution in each section times the normalized 
energy distribution function of the star image. 


If point sources are assumed, the energy distribution can be given 
by the diffraction limit distribution of: 

$$I(\rho) = \frac{I_0}{(1-\epsilon^2)^2} \left[ \frac{2 J_1(ka\rho)}{ka\rho} - \epsilon^2 \, \frac{2 J_1(k\epsilon a\rho)}{k\epsilon a\rho} \right]^2 \tag{3.4-5}$$

where $\epsilon$ is the radial obscuration factor, $\rho$ is the radial 
coordinate of the diffraction pattern, $k = 2\pi/\lambda$, a is the 
aperture radius, and $I_0$ is the central peak intensity of the 
diffraction pattern. This is normalized by the integral of 
$I(\rho)$ over all $\rho$. Note that 3.4-5 is a function of wavelength. 
When a broadband filter detector system is used, the energy 
distribution is given by the integral over wavelength of 3.4-5 
times the filter function $F(\lambda)$. 
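
The diffraction pattern of a centrally obscured circular aperture 
can be evaluated with a few lines of code. The sketch below uses 
the standard annular-aperture form, with a power series for the 
Bessel function $J_1$ so that no external library is needed; the 
series is an assumption of convenience that is adequate only for 
moderate arguments, and the function names are illustrative. 

```python
import math


def bessel_j1(x, terms=40):
    """J1(x) from its power series; adequate for moderate x only."""
    total = 0.0
    term = x / 2.0                      # m = 0 term: (x/2) / (0! * 1!)
    for m in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((m + 1) * (m + 2))
    return total


def jinc(x):
    """2*J1(x)/x, with its limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else 2.0 * bessel_j1(x) / x


def annular_airy(rho, wavelength, a, eps, i0=1.0):
    """Intensity at radial coordinate rho; a = aperture radius, eps = obscuration."""
    k = 2.0 * math.pi / wavelength
    x = k * a * rho
    amplitude = jinc(x) - eps ** 2 * jinc(eps * x)
    return i0 * amplitude ** 2 / (1.0 - eps ** 2) ** 2


# Peak intensity on axis for a hypothetical 30% central obscuration.
on_axis = annular_airy(0.0, wavelength=1.0, a=1.0, eps=0.3)
```

A broadband response would follow by summing such patterns over 
wavelength with the filter function as the weight. 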


For non-diffraction limited optics, other intensity functions 
can be used as given in Born and Wolf. Since each source has 
a positional uncertainty in cross scan given by the product of 
the uniform distribution and 3.4-5, and a similar uncertainty 
product in the scan direction, the combination of multiple 
detections implicitly assumes an adequate overlap of the 
individual positional uncertainties. In previous programs, the 
distribution functions were assumed to be sharp, rectangular 
boxes corresponding to the detector instantaneous field of view 
for each detection. This proved to be adequate for combining 
detections but involved some care in implementation in the 
software because, in general, no corner of one box fell inside 
the second box. For survey missions of higher sensitivity, such 
an approximation must be examined carefully to develop the best 
combinational approach to multiple overlapping positions. The 
complexity of convolving third, fourth, or further measurements 
adds yet more difficulties. The convolution must be done with 
care if the multiplicity of measurements is to reduce the 
uncertainty of the measured position while providing multiple detection 
confirmation. Further, the time-to-position transformation for 
each scan introduces a third level of uncertainty for multiple 
scan combinations. 

The most complex weighting problem for large IR surveys is 
the determination of a confidence measure for each source, 
given multiple measures in (possibly) multiple spectral bands. 
The task is more difficult than even the combination of all the 
measurements since the survey data has reference to other surveys 
made in similar wavelengths and in other parts of the spectrum. 
The wide variation in noise and sensitivity from detector to 
detector and from measurement to measurement must be accounted 
for. The origin of the various multiple measurements does allow 
a reasonable separation of the combination task into a series of 
combinations. The sequence presented in Figure 4 (Section 3.1) 
indicates a possible ordering of the several combination 
steps. 

The first level of multiplicity in detection occurs in the 
focal plane at step 7. Combinations here account for the overlap 
of detectors and the matching of the multiple detector 
scans. Lacking external knowledge of the spatial extent of a 
detected source, signals occurring on two adjacent channels are 
attributed to a single source if the signature on each detector 
is essentially point-like. (See Section 4 for non-point 
objects.) Then, if the time spacing between measurements is 
within the bounds set by column spacing, scan rate, and 
associated uncertainties, the source is assumed to transit the 
region of overlap between the two detectors. Such a pair of 
measurements is then combined in amplitude, its positional 
uncertainty assigned to the overlap region, and its confidence 
determined as described below for combining S/N values. If 
detections of proper time spacing occur in other colors in the 
same detector row or rows, then the multiple color measurements 
are assigned to a common position, retaining the separate 
amplitudes in each color. Naturally, this multiple color 
combination occurs after detector overlap testing. 

Within a single scan, the combination of focal plane 
characteristics is done in the time domain. This prevents positional 
uncertainties associated with the sensor pointing history from 
affecting the combination success and error rates. For 
independent scans the combination of repeated measurements 
necessarily occurs in a celestially fixed coordinate frame. It 
is intuitive that the time-to-position transformation must be 
done carefully to maintain the minimum error box size. 


For two independent measurements the best combination of 
information occurs in a co-adding sense. That is, the peak 
signals and the square noise variances are combined and used 
to produce a new S/N value. Measures of the signals and noises 
in common units must be retained to make this combination 
properly. The new S/N is given by: 

$$(S/N)^2 = \frac{(S_1 + S_2)^2}{N_1^2 + N_2^2} \tag{3.4-6}$$

where the subscripts 1 and 2 refer to the first and second 
signals S and noises N. Generalizing 3.4-6 for n-tuple 
measurements, 

$$(S/N)^2 = \frac{\left( \sum_{i=1}^{n} S_i \right)^2}{\sum_{i=1}^{n} N_i^2} \tag{3.4-7}$$

The reader should recall that the MSD noise is calculated so 
that the $N_i^2$'s are immediately available; then the square of 
the new S/N is found by squaring the sum of signal values and 
dividing, thus saving a slow square root operation in favor of 
a faster multiplication. 

Two extensions to the above algorithm can be made. First, the 
non-detection of a source at a given position is automatically 
handled by setting one of the $S_i$'s to zero. This will tend to 
underestimate the $(S/N)^2$ value since the signal could have been 
as large as K times the rms noise and still not be listed as a 
front-end detection. K is the threshold level at the first 
detection algorithm. Second, 3.4-7 can be extended to include 
weighting factors based on the sensitivity of each detector. 
Since the ith detector could have a sensitivity different 
from the average, the $S_i$ signals in the numerator of 3.4-7 can 
be multiplied by $R_i/\langle R \rangle$ where $R_i$ is the responsivity 
of the ith detector and $\langle R \rangle$ is the average responsivity. 
More complex forms of the detection probability function can be 
used if sufficient information exists to describe $P_d(i)$. 
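
The co-added confidence measure, the square of the summed signals 
divided by the summed noise MSD values, can be sketched as follows 
together with both extensions. The function name, the optional 
responsivity argument, and the sample values are assumptions for 
illustration. 

```python
def combined_snr_squared(signals, noise_msds, responsivities=None):
    """(S/N)^2 = (sum S_i)^2 / sum N_i^2, optionally weighting S_i by R_i/<R>."""
    if responsivities is not None:
        mean_r = sum(responsivities) / len(responsivities)
        signals = [s * r / mean_r for s, r in zip(signals, responsivities)]
    total_signal = sum(signals)
    return total_signal * total_signal / sum(noise_msds)


# Two detections with individual (S/N)^2 of 9 and 16.
two_scans = combined_snr_squared([6.0, 8.0], [4.0, 4.0])
# The same position with the second scan a non-detection (S set to zero).
with_miss = combined_snr_squared([6.0, 0.0], [4.0, 4.0])
```

No square root is taken at any point; a threshold on the combined 
value would be applied to its square as well. 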

When the additional measurements are in different colors, 
special care must be used in combining the confidence measures. 
The spectrum of the source is not flat over the wavelength 
bands covered by the detector system. That means that 
different classes of sources will have different ratio responses 
in the multi-color measurements, and the ratios will depend in 
part on the relative sensitivities of the wavelength bands. This 
immediately suggests a weighting factor for combining the S/N 
ratios. A table of color ratios versus temperature can be 
created by convolving a black body spectrum at each temperature 
with each filter-detector combination. Entries in such a table 
are used to give a weighting factor for each color band based on 
the source color temperature which is then used to combine the 
respective S/N values analogously with 3.4-7. (This seemingly 
circular calculation can be achieved in practice by using the 
already calculated amplitude estimates to determine the color 
temperature, which is then used to weight the S/N values.) 
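
A color ratio table of this kind can be sketched by integrating 
the Planck function over each band. The two bands below are 
hypothetical flat-topped passbands chosen only for illustration; 
measured filter-detector response curves would replace them in 
practice, and the temperature grid is likewise an assumption. 

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K


def planck(wavelength_m, temp_k):
    """Black body spectral radiance B_lambda(T), W m^-3 sr^-1."""
    x = H * C / (wavelength_m * KB * temp_k)
    return 2.0 * H * C ** 2 / wavelength_m ** 5 / math.expm1(x)


def band_signal(lo_um, hi_um, temp_k, steps=200):
    """Integrate the black body over an ideal flat band (midpoint rule)."""
    width = (hi_um - lo_um) * 1e-6 / steps
    return sum(planck(lo_um * 1e-6 + (i + 0.5) * width, temp_k) * width
               for i in range(steps))


# Hypothetical bands in microns.
SHORT_BAND, LONG_BAND = (8.0, 15.0), (15.0, 30.0)

color_table = {t: band_signal(*SHORT_BAND, t) / band_signal(*LONG_BAND, t)
               for t in (100, 200, 300, 500, 1000)}
```

Because the short-to-long band ratio rises steadily with 
temperature, a measured ratio can be inverted through the table to 
give a color temperature. 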

3.5 False Source Algorithms 

During other survey programs a number of phenomena were 
identified which produced detectable signals. These were 
initially identified as potential sources, but inspection of 
the data records revealed some unusual characteristics. 

Analysis indicated that several mechanisms produced false 
signals which were so distinctive that they could be fully 
eliminated from the data. Cosmic rays and other ionizing 
particles produced characteristic rapid rise signals; dust 
particles exhibited a typical out-of-focus doughnut covering 
many detectors, and off-axis Earthshine produced azimuthally 
correlated extended objects. 

Figures 5 and 6 illustrate radiation particle hits. These
events ionize the detectors, often saturating the conduction
band, and produce signal pulses characteristic of the impulse
response function of the sensor electronics. If the signal were
examined at the output of the detector amplifier, such pulses
would be nearly delta functions with a duration governed by
the time necessary for the bias supply to drain off the ionized
electrons. This time is characteristically milliseconds, with
rise times of the order of microseconds per volt. The
electronic bandpass filters degrade this sharp spike to the
illustrated signal. The rapid rise of these signals is
preserved well enough, however, to distinguish large pulses from
real sources very easily, and rise slope has been used
in many programs to identify such spikes. Some confusion
occurs when the spike heights are smaller because the
sampling rate begins to confuse the rise slope calculation.
Typical spikes reached their peak or A/D limiting value in 2 to 4
samples, whereas the sharpest point sources covered 8 to 10
samples before reaching their peaks.
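
The rise-time test described above can be sketched as follows. This
is only an illustration: the function names, the 10% onset level,
and the "ambiguous" class are assumptions, and only the sample
counts (2 to 4 for spikes, 8 to 10 for point sources) come from the
text.

```python
def samples_to_peak(signal, onset_fraction=0.1):
    """Count samples from pulse onset to peak.

    Onset is taken (as an assumption) to be the first sample of the
    unbroken run leading up to the peak that exceeds
    onset_fraction * peak.  Returns None if there is no positive peak.
    """
    peak_index = max(range(len(signal)), key=lambda i: signal[i])
    peak = signal[peak_index]
    if peak <= 0:
        return None
    level = onset_fraction * peak
    start = peak_index
    # Walk backwards from the peak while the signal stays above onset.
    while start > 0 and signal[start - 1] > level:
        start -= 1
    return peak_index - start + 1


def classify_pulse(signal, spike_max_samples=4, source_min_samples=8):
    """Label a pulse by its rise time, using the sample counts quoted above."""
    rise = samples_to_peak(signal)
    if rise is None:
        return "none"
    if rise <= spike_max_samples:
        return "spike"
    if rise >= source_min_samples:
        return "source"
    # Rise times between the two ranges cannot be decided by slope alone.
    return "ambiguous"
```

Pulses that fall between the two ranges are deliberately left
undecided here, which mirrors the confusion noted for small spikes.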

Analytical models of the radiation particles have indicated that a
spectrum of potential pulses should be seen by these IR
detectors, extending to much smaller amplitudes than actually
experienced. This could be caused by failure of the slope
discrimination algorithm for small amplitudes or by inaccuracies
in the model. However, if small spikes are missed by the algorithm,
they will likely remain in the data lists because they appear as
high S/N sources. Multiple observation tests must be carefully
arranged so that a single large S/N signal cannot pass in
order to avoid this problem.

Figure 5. Sample Data - Radiation Spikes with Crosstalk

Figure 5 shows a second difficulty of particle events.
Typically, their electrical signals are strong enough to cause
significant crosstalk signals in other detectors. Since these
have been doubly band filtered, the crosstalk signal looks much
more like a real source than the original radiation event.
However, these are easily identified by their time correlation
with the particle event. If the signal times of eliminated
spikes are retained, then the crosstalk signals can be tested
for and eliminated.
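
A minimal sketch of this time-coincidence test is given below. The
record layout (a dictionary with a "time" entry), the function name,
and the coincidence window are assumptions made for illustration.

```python
import bisect


def flag_crosstalk(detections, spike_times, window):
    """Split detections into (kept, crosstalk) by coincidence with spikes.

    detections  : list of dicts, each with a "time" key (hypothetical layout)
    spike_times : times of spikes already eliminated on any detector
    window      : half-width of the coincidence window, same units as time
    """
    spike_times = sorted(spike_times)
    kept, crosstalk = [], []
    for detection in detections:
        t = detection["time"]
        # First retained spike time at or after the start of the window.
        i = bisect.bisect_left(spike_times, t - window)
        if i < len(spike_times) and spike_times[i] <= t + window:
            crosstalk.append(detection)
        else:
            kept.append(detection)
    return kept, crosstalk
```

Sorting the retained spike times once allows each candidate to be
checked with a binary search rather than a scan of the whole list.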

Figure 6 illustrates the characteristic signature of a dust
particle. Since the particle is very nearby, the image is
severely out of focus, producing an image in the focal plane of
the illuminated primary mirror with the central spot darkened
by the secondary mirror system; the size of this doughnut
depends on the distance to the dust particle. Because the
image is out of focus, each detector is typically fully
illuminated by the particle. Simple radiation balance
calculations give an equilibrium temperature of around 270°K
for these particles illuminated by Earthshine (all observations
were made in the sun's shadow) so that the 4 micron band has
very little energy and the 20 micron band is most strongly
excited, as the figure illustrates. The image's double hump
and the low color temperature are the characteristics which
allow simple discrimination algorithms. One must be sure to
check all possible channels for time coincidence signatures as
well, since the detector which transits the edge of the doughnut
will have only a single hump signature.

Another type of false source is the nonstationary space
bodies. These include Earth satellites, planets, asteroids,
meteors, and comets. The planets are readily identified
because of their known positions, with the caveat that the
planet's location at the time of observation must be known. For
the outer planets the proper motion is very small, and observation
time is not critical. For the closer ones, however, over the
course of a year-long survey, the total motion will be significant,
and the varying viewing aspect due to the sensor's orbit must
be accounted for. More difficult to deal with are the 1200
known asteroids since their orbits are not accurately determined
in all cases. Even worse, extrapolation of the known asteroid
population (Ref. 12) indicates that tens of thousands of completely
unknown objects could possibly be seen by a very sensitive
infrared system. A great body of science can be recovered,
however, if the motion of these discovered asteroids can be
used to determine orbital elements; the resulting distance
knowledge allows determination of albedo and size parameters
for the asteroids.

The most difficult moving objects to deal with are Earth
satellites. The large number of these presents a formidable
difficulty, and their very rapid relative motion compounds the
problem. A good many of these sources have known orbits,
however, reducing the task to checking the lists for potential
identification, although the positional computations involved
are not trivial. Satellites in nearly synchronous orbits could
be a greater problem because their relative motion will be
smaller. As with the asteroids, a major task will be the
association of a given observation with an object seen in a
previous observation. If the lists of possible moving objects
(that is, all large signals seen only once in a given position)
are large, it may be difficult to trace a single object's motion
from observation to observation.
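
The list of possible moving objects defined above (signals seen only
once at a given position) can be assembled with a simple positional
match. The sketch below assumes flat-sky coordinates, a square
matching box, and an (x, y, time) tuple layout; none of these
choices is taken from the survey software itself.

```python
def single_sightings(observations, tolerance):
    """Return observations with no second sighting at the same position.

    observations : list of (x, y, time) tuples (hypothetical layout)
    tolerance    : half-width of the square matching box, in the same
                   units as x and y
    """
    singles = []
    for i, (x1, y1, t1) in enumerate(observations):
        confirmed = any(
            j != i and abs(x1 - x2) <= tolerance and abs(y1 - y2) <= tolerance
            for j, (x2, y2, _t2) in enumerate(observations)
        )
        if not confirmed:
            singles.append((x1, y1, t1))
    return singles
```

This direct comparison grows as the square of the list length, which
is one reason a large candidate list makes object tracing difficult.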

4.0 DATA PROCESSING FOR EXTENDED SOURCES

In contrast to the thorough analysis of techniques, background,
and algorithms described in Sections 2.0 and 3.0 for point-source
surveys, very little is understood of the data processing task
for extended objects. As discussed in Section 2.4, this
includes objects which are slightly larger to very much larger
than the detector resolution. In some ways the desired results
are similar to the point-source cataloging of Section 3.0, but
in others the task is totally different. This section will discuss
first the extended objects which are similar to point sources,
then the wide field sources, their resulting final products, and
the approach to processing them.

4.1 Slightly Extended Objects 

SEO's are not much different from point sources. In general,
their characteristic signatures are only perturbations of a
point-source signal. Typically, they will be seen in only one
or two detectors, and the signal will be two to three times
longer in duration than that of a point source. Photometrically,
their edges can be as sharply defined as point sources so that the
upper frequency limit of their signal is the same as for point
sources; their lower frequency is only 10% to 30% lower than the
point source and is due only to the increased dwell time caused by
a source image a few times larger than the blur circle.

Physically, these objects are associated with large circumstellar
shells and bright knots in H II emission regions.

Detection of SEO's can be done exactly the same as for point sources
if the increased dwell time is allowed for. One possible
approach to this is to use a double correlation model, matching
the characteristic rising portion of the signal separately from
the characteristic falling portion. The variable spacing
between these two edges then gives a measure related to the
source's angular extent. Another method is to use point-source
correlation but simultaneously test the peak signal-to-noise
ratio. Then a source with a high enough S/N value but a low
correlation coefficient would indicate the presence of an
extended object, and measurement of the pulse width would be
related to the angular extent.
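
The second method (high S/N combined with a low point-source
correlation coefficient) can be sketched as follows. The gate
values, the half-maximum width measure, and all function names are
illustrative assumptions, not values from the survey programs.

```python
import math


def correlation(signal, template):
    """Normalized correlation coefficient of two equal-length sequences."""
    n = len(template)
    mean_s = sum(signal) / n
    mean_t = sum(template) / n
    num = sum((s - mean_s) * (t - mean_t) for s, t in zip(signal, template))
    den = math.sqrt(
        sum((s - mean_s) ** 2 for s in signal)
        * sum((t - mean_t) ** 2 for t in template)
    )
    return num / den if den else 0.0


def pulse_width(signal, fraction=0.5):
    """Number of samples at or above a fraction of the peak value."""
    peak = max(signal)
    return sum(1 for s in signal if s >= fraction * peak)


def classify(signal, template, noise_rms, snr_gate=5.0, corr_gate=0.9):
    """Apply the S/N gate first, then the point-source correlation gate."""
    if max(signal) / noise_rms < snr_gate:
        return "no detection"
    if correlation(signal, template) >= corr_gate:
        return "point source"
    # Strong signal that does not look like the point-source template.
    return "extended candidate"
```

For a signal classed as an extended candidate, the pulse width
relative to the template width then gives the measure related to
angular extent.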

For best performance, the detection routine for SEO's should
use a digital filter matched to the bandwidth of the source's
signature. This filter would be similar to the point source's
but of slightly larger angular extent. The upper frequency
limit is determined by the duration of the SEO pulse. If a point
source produced a rectangular pulse of duration $\tau$ (equal to the
dwell time on the detector), the power spectrum would have its
first zero at a frequency $f_0$ of $1/2\tau$. Then an SEO with a dwell
time of $\tau(1+\epsilon)$ would have a lower cutoff frequency of
$f_0(1-\epsilon)$.
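
The last step is a first-order expansion, written out here for
reference. The symbol $f_{SEO}$ and the numerical value of
$\epsilon$ are introduced only for this illustration; the cutoff is
assumed to scale inversely with the dwell time, as in the text.

```latex
f_{SEO} = \frac{1}{2\tau(1+\epsilon)}
        = \frac{f_0}{1+\epsilon}
        = f_0\left(1 - \epsilon + \epsilon^2 - \cdots\right)
        \approx f_0(1-\epsilon), \qquad \epsilon \ll 1 .
% Example: \epsilon = 0.2 gives f_0/1.2 \approx 0.83 f_0 exactly,
% against the first-order estimate 0.80 f_0.
```

The approximation therefore degrades as the object grows; for dwell
times two to three times the point-source value the exact form
$f_0/(1+\epsilon)$ would be the safer choice.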

Once the SEO has been detected and a value assigned to its extent,
the source can be treated just as a point source measurement.
The criteria for multiple observation and reasonable spectral
matching and brightness determination follow point-source
requirements exactly, except that the positional uncertainty is
increased due to the size of the source. Since these objects
will not (by definition) cover more than two detectors, the
cross-scan position error is relatively unchanged, but the
scan error should increase by roughly the angular size.

The SEO's are assumed to have sharply defined edges and 
reasonably uniform brightness distributions across their discs; 
some error in determining their size results from such limitations, 
so it may be worthwhile to approximate the size of these objects
in quantized steps. That is, if the size error is ±3 arc 
minutes, then SEO's could be given as 0, 3, 6, 9, 12, ... arc 
minutes. This approach would save some computation time over 
calculating the individual size to one or two digits without 
losing information. 
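
The quantization suggested above amounts to rounding the measured
size to the nearest multiple of the size error. A minimal sketch,
with a hypothetical function name and the 3 arc minute step of the
example as the default:

```python
import math


def quantize_size(size_arcmin, step_arcmin=3.0):
    """Round an SEO size to the nearest multiple of the size error."""
    if size_arcmin < 0:
        raise ValueError("size must be non-negative")
    # floor(x + 0.5) rounds halves upward, avoiding round-half-to-even.
    return step_arcmin * math.floor(size_arcmin / step_arcmin + 0.5)
```

With the default step, a measured size of 7.6 arc minutes would be
catalogued as 9 and a size of 1.0 arc minute as 0 (a point source).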

4.2 Photometric Mapping

For truly extended sources, the brightness distribution of the 
source determines the resulting data signature. The analysis 
of this data intends to recover the spatial variations in 
brightness and present it in a readily understandable manner. 

The two most common presentations are contour maps of the
brightness and photo images. The contour map has the advantage
of being easily quantized, while photo images are more useful
in understanding variations near the resolution limits of the
survey. The techniques of producing these products are in
widespread use on a number of other programs; the method to
gather and process the initial data is much less understood.

Because it is easily quantized, contour mapping is the most
commonly used data product for infrared and radio surveys.

Data input for these measurements is typically from
beam-switched telescopes with the two beams aligned with an
individual scan line and multiple adjacent scan lines made
over an extended emission region. The individual scans are
essentially sequences of difference measurements. These
sequences can be algebraically inverted to produce the brightness
values along the scan line with some errors introduced by the
inversion process due to the D.C. instability of the numerical
inversion. The effective resolution element is typically
somewhat larger than the beam size due to these instabilities.
Multiple adjacent scans then give an array of local brightness
measurements which is then used as input to standard contour
plotting routines.
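
The inversion of a difference sequence, and the way a single error
persists along the restored scan, can be sketched as follows. The
beam throw expressed in samples, the function names, and the use of
known starting values as the boundary condition are assumptions
made for illustration.

```python
def difference_scan(brightness, throw=1):
    """Simulate beam switching: each output is b[k + throw] - b[k]."""
    return [brightness[k + throw] - brightness[k]
            for k in range(len(brightness) - throw)]


def restore_scan(differences, start_values):
    """Invert the differences given the first `throw` brightness values.

    Each restored sample is the sample one beam throw earlier plus the
    measured difference, so any error in a difference is carried into
    every later sample on the same chain.
    """
    restored = list(start_values)
    for k, d in enumerate(differences):
        restored.append(restored[k] + d)
    return restored
```

With exact differences and correct starting values the scan is
recovered exactly; a one-unit error in the first difference shifts
every later sample on that chain by one unit, which is the D.C.
instability referred to above.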

Photo image processing is a powerful analysis tool not used
extensively in astronomical studies but common in planetary
investigations. Using the same array of brightness elements as
described above, a photo image is produced by converting each
brightness value to a grey scale (or a color scale) value on a
printing device or a cathode-ray tube. Using multiple
strike-overs, an eight-level grey scale, for example, can be
produced on a standard line printer using the algorithm below.
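
A minimal sketch of one possible overstrike scheme follows; it is
not the report's own listing. The character sets in LEVELS and the
linear mapping of brightness to level are hypothetical choices.

```python
# Hypothetical overstrike sets, lightest to darkest.  Each string lists
# the characters printed on successive passes over the same position.
LEVELS = ("", ".", "-", "=", "+", "X", "XO", "XOM")


def grey_level(value, lo, hi, levels=8):
    """Map a brightness value linearly onto 0 .. levels-1, clipping."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    fraction = (value - lo) / (hi - lo)
    return min(levels - 1, max(0, int(fraction * levels)))


def overstrike_passes(row, lo, hi):
    """Return the print passes for one image row.

    A line printer would print each pass followed by a carriage return
    without line feed, so later passes land on top of earlier ones.
    """
    codes = [LEVELS[grey_level(v, lo, hi, len(LEVELS))] for v in row]
    n_pass = max(1, max((len(c) for c in codes), default=0))
    return ["".join(c[p] if p < len(c) else " " for c in codes)
            for p in range(n_pass)]
```

For a row of values 0, 50, and 100 scaled between 0 and 100 this
gives three passes, with only the brightest element struck on the
second and third.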

The greatest difficulty in using photo image representations of
the data is that each individual element is commonly not an
independent brightness measure. Rather, it is overlapped by
the information of adjacent elements, a result of both the
measurement technique and artifacts of the data recovery
algorithms. The resulting image rarely has the resolution
implied by the beam size of the system, and the photo product
appears to have very low contrast. A number of techniques have
been devised for planetary image processing to improve this
situation. These techniques generally trade off the photometric
accuracy of the image for the spatial resolution desired. Thus,
photo images are a supplement to contour maps of source intensity,
not a replacement. The algorithms to be used for contrast
enhancement and for resolution enhancement will be reviewed in
phase 3 of this study; these will be adaptations from similar
current efforts in image processing.

A more difficult problem is the creation of the array of
intensity measurements. Survey instruments typically do not
use beam switching, relying instead on spatial scanning to
modulate the signal from the infrared detectors. The outputs
are bandlimited in frequency to avoid the difficulties of D.C.
drifting so that the information on the wide-scale intensity
distribution of a source is lost or at best compressed severely.
Successful mapping of extended regions requires that the
information content at the frequencies corresponding to the
desired spatial extent be restored. It is immediately apparent
that the measurement technique has performed a spectral
compression of the spatial image. It is thus necessary to
understand the compression function and successfully invert it
to recover the desired intensity data. Very little is currently
understood of the scope of this task and its potential limitations.
Candidate techniques for this inversion are either algebraic or
an application of orthogonal transformations.

Algebraic restorations are the simplest to implement. Given the
transfer function of the scanning telescope system, the (digital)
difference equations can be written, as described in Gold and
Rader and in Section 3.0 of this report. Then the n equations
relating the several input and output samples are algebraically
inverted to express the input values as a function of output
samples. This system is then incrementally solved given the
detector's sequence of measured output samples. Several
difficulties arise with this approach. Since the algebraic
inversion is based on the ideal transfer function, there are
inherent limitations in the accuracy of the restoration due to a
misfit between the actual system and its model transfer
function. Further, the algebraic methods are inherently unstable
in the presence of noise (Ref. 13).

Since the A.C. coupled transfer function typically has zeros at zero
frequency, the inversion will have unstable poles at zero
frequency. This D.C. instability will require iterative
fitting of short scan segments with the D.C. value of each end
defined (or at least assumed). The task is the digital
equivalent of the solution of a non-linear differential equation
with defined boundary conditions, a formidable task. This is
further complicated by the effects of digitization, which
necessarily introduce at least a one-bit uncertainty in the
lowest frequency of the system data, which drives the D.C.
instability. Coupled to this are the effects of inverting
wide-band noise, and the algebraic method becomes almost
intractable. It is difficult to envision a successful inversion
unless the signal is so large that noise can be smoothed out (a
form of severe high frequency filtering) and digitizer uncertainty
becomes negligible. In such a case, however, the spatial
resolution of the system is degraded by the smoothing.

The difficulty of using a direct inverse of the transfer function
can be seen as follows. The sequence of output values $r_m$ is
related to the noise $n_m$ and the object's intensity distribution
$o_n$ by the transfer function $h_{mn}$. That is:

$$ r_m = \sum_{n} h_{mn}\, o_n + n_m $$                       4.2-1

The inverse of this, using M=N observed samples, is in matrix
notation:

$$ O = [h]^{-1} R - [h]^{-1} N $$                             4.2-2

Now if N is an unknown random function, then the second term in
the r.h.s. of 4.2-2 is the error in reconstructing the original
intensity distribution. Since most transfer functions are
simple, [h] is mostly zero, with small elements near the diagonal,
so that $[h]^{-1}$ has many large elements. Then for samples with
finite noise $n_m$, the error in reconstruction is still randomly
distributed but very large. An example by Phillips (Ref. 13) with an
input signal plus noise of S/N > 2000 was reconstructed to an S/N
of less than 3.
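
The growth of the error term $[h]^{-1} N$ can be seen with a very
simple assumed transfer function. In the sketch below the system is
taken to be a first difference, $r_m = o_m - a\,o_{m-1}$, which for
$a = 1$ has a zero at zero frequency; this model, the parameter $a$,
and the function names are assumptions and do not represent any
particular sensor.

```python
def apply_transfer(obj, a=1.0):
    """Forward model: r[m] = o[m] - a * o[m-1], with o[-1] taken as 0."""
    out, prev = [], 0.0
    for o in obj:
        out.append(o - a * prev)
        prev = o
    return out


def invert_transfer(response, a=1.0):
    """Algebraic inverse: o[m] = r[m] + a * o[m-1], solved incrementally."""
    out, prev = [], 0.0
    for r in response:
        prev = r + a * prev
        out.append(prev)
    return out


def reconstruction_error(obj, noise, a=1.0):
    """Error left in the restored object when noise is added to r."""
    noisy = [r + n for r, n in zip(apply_transfer(obj, a), noise)]
    return [est - o for est, o in zip(invert_transfer(noisy, a), obj)]
```

Here [h] has only two non-zero diagonals, but its inverse is a full
lower triangle of powers of $a$, so a noise value on a single sample
is carried into every later restored sample.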

A potentially more successful approach to the task involves the
use of orthogonal transformations. Essentially, the scan
matrix is transformed to a domain which allows some separation
of the noise and digitizer effects from the data. The data is
then weighted to recover the low frequency information and
re-transformed to the original domain, creating the intensity
array. This is then mapped by photo imaging or contour plotting
and analyzed. This approach is commonly used in television
image compression codes where the compression and recovery are
externally controlled. In the survey problem, our goal is to
discover the original compression code and invert it with the
minimum error. Transformations which have been successful in
such applications include Fourier and Hadamard methods and
Karhunen-Loeve transformations. The latter are probably optimal
in the sense of minimum least-square errors in the ultimate
results, but except for simple (and thus limited) approximations are
unwieldy to implement. Since the Discrete Fourier Transformation
(DFT) is a limiting case of the Karhunen-Loeve transformation for
independent data, it is intrinsically attractive. The forward
DFT is given by

$$ F_k = \sum_{0 \le n \le N-1} f_n \exp(-i 2\pi n k / N) $$            4.2-3

and its inverse is:

$$ f_n = \frac{1}{N} \sum_{0 \le k \le N-1} F_k \exp(i 2\pi n k / N) $$   4.2-4

where the input data sequence is $(f_0, f_1, \ldots, f_{N-1})$ and the
transformed data are $(F_0, F_1, \ldots, F_{N-1})$. The transformed
sequence is naturally ordered by the index k, with
increasing k (up to k = N/2) corresponding to higher frequency
components.

The Fast Fourier Transformation (FFT) is an efficient method
widely used to compute the DFT as given above.
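
The transform pair can be evaluated directly from its definition,
as in the sketch below. This direct form takes of the order of
N squared operations and is shown only to make the definitions
concrete; a production system would use an FFT routine. The function
names are illustrative.

```python
import cmath


def dft(f):
    """Forward DFT: F[k] = sum over n of f[n] * exp(-i 2 pi n k / N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def idft(F):
    """Inverse DFT: f[n] = (1/N) sum over k of F[k] * exp(i 2 pi n k / N)."""
    N = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]
```

Applying idft to the output of dft returns the input sequence to
within rounding error, and the k = 0 element is the sum of the
samples, that is, N times the D.C. level of the scan.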

4.3 Detection of Extended Emission 

The previous section considered the problem of recovering the
spatial intensity information for extended sources. Since a
significant portion of the potential objects is known a priori,
the major difficulties are in the reconstruction. It is assumed
that the scan data is made available for the full known extent
of an object such as the galactic center and that the data is
treated to recover and map the intensity distribution. A second
problem exists in discovering those objects which are not known
as extended emission regions. Of course, it
would be possible with unlimited computing resources to recover
the entire intensity over all the sky and then "discover"
unknown emission from the resulting all-sky map. However, for
surveys designed to gather stellar information as well, the
instrument's limitations imply a sacrifice of some extended
source capability. With limited resources and compromised
data, a more worthwhile approach would be to identify the
region in the unprocessed survey data and then map the limited
area of interest.

This task is not as difficult as one might suppose from
extrapolating the point-source detection problems. The
mapping algorithms are intrinsically limited in the accuracy
of the recovery by noise and instabilities, which implies
constraints on the dimness of the extended source (or the
strength of spatial intensity gradients), on its upper size
limit, and on the achievable resolution. Generally, the
mapping procedures will require peak-to-rms S/N values of 20
or more over regions not exceeding 10 degrees. As long as the
sensor electronics do not exhibit D.C. drifting over a
comparable range, a simple peak detecting algorithm measuring
the local mean signal in a window larger than ten degrees
should discover most of the unknown regions which are mappable
with a survey instrument. Section 3.0 covered the algorithms
applicable to the peak detection task.
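
A minimal sketch of such a local-mean test is given below. The
window is expressed in samples rather than degrees, and the
threshold and function names are assumptions made for illustration.

```python
def running_mean(samples, window):
    """Mean of each run of `window` consecutive samples."""
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    total = sum(samples[:window])
    means = [total / window]
    for i in range(window, len(samples)):
        # Slide the window: add the new sample, drop the oldest.
        total += samples[i] - samples[i - window]
        means.append(total / window)
    return means


def extended_candidates(samples, window, threshold):
    """Start indices of windows whose local mean exceeds the threshold."""
    return [i for i, m in enumerate(running_mean(samples, window))
            if m > threshold]
```

Because the mean is taken over many samples, a narrow point source
or spike raises it very little, while a broad plateau of modest
amplitude passes the threshold.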

5.0 DATA PROCESSING SYSTEMS

There are a number of ways to implement the survey data processing
scheme described in Sections 2.0 through 4.0, depending on the
facilities and resources available and on the operational
constraints. Previous survey programs have generally had
unrelated observing and data processing schedules, with the
data reduction taking four to twenty times as much total CPU
time as the sensor's observing time. For example,
processing the data collected on three 100-minute orbits of
the CMP sensor required over 120 hours of computer time on an
XDS Sigma 7 machine. On the other hand, the massive task of
the IRAS mission allows only 18 months to process 8 months of
data, including the generation of many final products (catalogs,
overlays) not involved in the CMP effort. The tremendous
consumption of CPU time in previous programs indicates a need
to organize an IR data processing system with care. The
following sections describe a basic division of the processing
task into two sections, the front-end detection and the back-end
cascade, and a number of parallel monitoring functions. This
structure is dominated by the point-source processing requirements,
which are well understood. The extended source mapping is
roughly a parallel function with the interaction points indicated
in the flow diagrams.

5.1 Overall Computing Structure 

Figure 7. Computing Structure

Figure 7 diagrams the suggested processing flow structure; this
is just a formalized grouping of the processing tasks discussed
previously. In the software packages, the pre-conditioning,
front-end processor, and part of the monitor functions can be
combined to form a single executing program. The back-end
cascade and remaining control/monitor functions form a second
operating package which can run off the data tapes output by
the first package. Final products generation is best run as a
third independent group since the interaction within this
package is dominantly based on graphic and publication
requirements, not on scientific decisions. The first group
processing is generally run as a fixed operation designed to
extract the statistically maximum amount of information from the
data stream; control of these functions is based only on
system load requirements. The scientific decisions interface
with the processor flow in the second group, where tradeoffs
occur to maximize the quality of the data products. The extended
source processing is a fourth software package which uses the
pre-selected data output by the pre-conditioning phase to map
known regions of interest and the data on newly detected objects
from the front-end processor.

In small-scale surveys, each step of the sequence of Figure 7 
can be executed sequentially for the entire data block. For 
larger surveys the several steps would be running in unison as 
the data from each step was processed and passed on to the 
next. 

For very large-scale processing tasks, the operation can be
split into five distinct packages with the pre-conditioning
being separated from the front-end processor. This would allow
the use of multiple dedicated computers or a large-scale
parallel processor to continuously execute all the data
phases. In this way the later stages can process data as it
becomes available from the preceding level. Especially
advantageous in this case would be the use of hardware
processors dedicated to specific tasks within each group. For
example, the basic noise block calculation could be done by a
special CPU in the pre-conditioning. Likewise, a dedicated
correlation processor for the multi-channel data could be
performing the data transformations to convert the raw data
stream to optimally filtered or correlated data streams.
Another processor would then monitor these outputs and the
noise data to operate the detection function. Micro-coded
hardware processors can operate at very high speed if their
computing task is sufficiently limited; by using separate
processors for each basic task in the pre-conditioning and
front-end detection packages, a very high throughput can be
achieved. A supervisor computer could perform the monitor and
control operations for the two primary phases and channel the
final outputs to disk or tape storage devices for access by later
stages.
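
The idea of later stages consuming data as soon as the preceding
stage releases it can be illustrated in software with chained
generators. The block layout, the offset removal, and the simple
threshold detector below are hypothetical stand-ins for the real
pre-conditioning and detection tasks.

```python
def precondition(raw_blocks):
    """Stage 1: remove the recorded offset from each block of samples."""
    for block in raw_blocks:
        yield [s - block["offset"] for s in block["samples"]]


def front_end_detect(blocks, threshold):
    """Stage 2: emit a detection record for every sample over threshold."""
    for index, samples in enumerate(blocks):
        for position, value in enumerate(samples):
            if value > threshold:
                yield {"block": index, "sample": position, "amplitude": value}


def run_pipeline(raw_blocks, threshold):
    """Chain the stages; each block is detected as soon as it is conditioned."""
    return list(front_end_detect(precondition(raw_blocks), threshold))
```

No stage holds the whole data set; each block passes through
conditioning and detection before the next block is read, which is
the software analogue of the dedicated-processor arrangement.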

Once the data has passed the front-end stage, multiple
processing is no longer attractive since the purpose of the
third and fourth stages is to condense the mass of data into an
ordered catalog. This task requires sophisticated decisions
for data combination, making a general-purpose computer more
attractive. This is especially true for the final products
phase where high-level, high-speed graphics are required. The
mapping routines of the Extended Source Processor also require
the power of a large and versatile computer. The overall
processing system could consist of an array of high-speed
special hardware processors controlled by a dedicated
mini-computer. This would feed data to storage devices which are
accessed by a large, general-purpose computer. The remaining
processing would be done by software packages on this machine,
feeding the final outputs to the appropriate storage devices.

Monitoring functions performed in the first stage by the
mini-computer can feed real-time interactive devices. This would
allow the data processing scientists to discover flaws and
problems in the data quickly enough to make corrections before
excessive processing time is consumed. In previous programs
this interactive analysis was done by repeated batch processing on a
large computer, with several hours of CPU time commonly
consumed before the unexpected characteristics of the data were
understood and accounted for in the software. Further, each new
set of data required more interactive processing. By replacing
this multi-pass processing with an interactive facility, a
sizable portion of the CPU consumption can be saved.

With the prior understanding of the data quality, the monitor
functions of the later stages can be reduced to simple
checking of the results of each decision level. For example,
foreknowledge that a block of potential sources came from
low-quality data would allow the scientist to alter a decision gate
to prevent an excess false source rate. While this control
could be done automatically, the software required would be
complex and consumptive of processing time. By allowing
qualitative decisions to be made externally, the most complex
decisions are removed from the software requirements. All
that the monitor programs would have to do is provide enough
quantitative measures and displays to allow the judgments to be made
accurately. Since each stage of the processing reduces the
size of the data base, the need for real-time interaction
fades; it becomes feasible to rerun a processing step in the
back-end phase when difficulties are encountered, where this
would have hampered processing severely during the initial
phases.

5.2 Front-End Processing Flow

The data pre-conditioning and front-end detection phases and
their monitors comprise the front-end processor. The inputs
to this group are the raw survey data and the pointing
ephemeris, and the outputs include tapes of the extended source
data blocks, noise records, and two groups of detected sources.
The detections are separated into categories which can be judged
solely on their individual signatures as false signals, such as
dust and spikes, and real signals from potential stars. Also
output are the summaries of the monitor and control functions
and any records of housekeeping data from the raw data tapes.
Hardware implementation of the front-end phases can be used to
minimize the stretch of processing time over data gathering
time; a software approach can also be used in whole or in part,
saving hardware costs but probably increasing computing time.

The pre-conditioning task converts the packed integer telemetry
data into usable form. The data are unpacked and grouped as
streams of samples from each detector and each housekeeping
function, the voltage compression is inverted and offsets
removed, and the initial data monitoring task is performed.
This includes the tracking of record gaps and any operational
variations indicated in the housekeeping (such as a detector
turned off). If the PCM digitizing system produced a data
quality measure (typically telemetry signal strength), this is
monitored for interactive decisions. Preliminary calculations
of the noise are done for each block of data, time tags are
calculated, and the data passes to the front-end detection
phase. Interactive monitoring of this phase allows judgment
of the quality of the digital records so that bad tapes or
inadequate telemetry can be discovered as early as possible.
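
Two of the monitoring steps named above, record gap tracking and the
preliminary block noise calculation, can be sketched as follows. The
gap tolerance, the use of the rms about the block mean as the noise
estimate, and the function names are assumptions for illustration.

```python
import math


def find_gaps(time_tags, sample_interval, tolerance=0.5):
    """Return (last_good, next_good) time pairs that bracket record gaps.

    A gap is declared (as an assumption) when two consecutive time tags
    are more than (1 + tolerance) sample intervals apart.
    """
    limit = sample_interval * (1.0 + tolerance)
    return [(prev, cur)
            for prev, cur in zip(time_tags, time_tags[1:])
            if cur - prev > limit]


def block_noise(samples, block_size):
    """Preliminary noise estimate: rms about the mean of each full block."""
    estimates = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        mean = sum(block) / block_size
        variance = sum((s - mean) ** 2 for s in block) / block_size
        estimates.append(math.sqrt(variance))
    return estimates
```

A block that contains a bright source will show an inflated value
under this simple estimate, so the figures are only a first guide
for the interactive monitor.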

The front-end detector performs the first complex calculations
on the data; its associated monitor routines produce the
earliest judgment on the sensor's performance and a quality
measure of the survey data. Side calculations from the sensor
pointing ephemeris determine the time boundaries of
desired extended source data, and the raw data for those areas
are written on the extended source tapes. The data streams from
each channel are transformed to optimally bandlimited and
correlated sequences. These three sequences are processed by
the detection routines, and data passing the detection screens
are measured. These sources are written on either the false
source or potential star tapes for access by the later processing
stages. The remaining noise calculations are made and written
on the noise record with enough data to determine why a source
was not detected at a particular time if it is detected later in
the same spatial position. The remaining front-end monitor
functions produce noise spectra for analysis, summaries of the
false sources, status of the self-adapting detector routines,
and possibly sample plots of the raw or transformed data for
visual study. The control function here allows a statistical
evaluation of the survey instrument performance and records
the judgment of quality or confidence to be used during later
processing stages. Various levels of noise analysis are
performed using both the raw data and the transformed sequences
to generate noise frequency spectra and other analyses for
occasional study. The spectra should include data containing
real sources, false sources, and noise only, and at various
signal levels for the first two, so that complete understanding
of the data will be available.

5.3 Back-End Cascade 

The middle processing phase is called a cascade because of the
waterfall-like effect of the data flow. As more data is
gathered by the sensor and observations are repeated at various
levels of redundancy, the sources move from the raw master
source file of Figure 8 to the final data base. Each cascade
level monitors the planned and the actually executed observing
schedule, qualifying sources for the next level as sufficient
data is gathered. Both moving sources and fixed position
variable sources must be accounted for so that in addition to
reducing the data base, a number of auxiliary data bases are
generated. Interactive processing of the data is less needed
here, but the status reporting function of the supervisor
programs increases.

The organized sequence of the back-end cascade and its computations
were discussed in detail in Section 3.0. The most important
additions to the back-end routines in the overall system are the
monitor and status programs. Each cascade level must be
monitored, since the gate adjustment will best be done by
qualitative analysis of the output. Naturally, re-processing
of some gate levels will be needed, and a means of saving the
discards of each cascade will save time if that step must be
redone. However, this requires a significant amount of redundant
storage, since the entire body of data will end up being saved
three or four times. Scheduling of the cascade processing can
reduce this storage overhead if attention is paid to the
repeat-observation schedule of the sensor. Knowledge of data
quality variations, generated in the front-end phase, also
reduces the re-processing requirements once the functional
effect of gate level variation is understood for each gate.
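
The value of saving the discards can be seen in a small sketch.
It is written in Python and assumes, purely for illustration,
that each record carries a single numeric "score" against which
the gate level is applied; the real gates of Section 3.0 are not
necessarily of this form.

```python
def apply_gate(records, threshold):
    """Split records into (passed, discards) at one gate level."""
    passed = [r for r in records if r["score"] >= threshold]
    discards = [r for r in records if r["score"] < threshold]
    return passed, discards


def rerun_gate(passed, discards, new_threshold):
    """Redo a gate at a new level from its saved outputs alone.

    Because the discards were kept, no earlier processing step
    has to be repeated to recover the gate's full input.
    """
    return apply_gate(passed + discards, new_threshold)
```

The cost is the redundant storage noted above: every gate keeps
both of its output lists.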

Figure 8. Back-End Cascade Structure

Generation of the final products can be considered the last step
of the cascade sequence. Rather than further deleting unqualified 
data from the master data base, however, this processor phase 
subdivides the data into desired categories. For example, a
master catalog of the stars observed is commonly produced from
the best-estimate values of position and brightness, together
with the correlations of this catalog with other source catalogs.
Subclasses of this catalog may include the observation sequence
and possible parameters of variable sources, an extended source
catalog, or lists of sources with specified spectral characteristics.
Monitor functions of this final step describe the completeness 
of the several catalogs as the survey processing progresses. 
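
This report does not specify how the best-estimate values are
formed. One common choice, shown here only as an assumption, is
the inverse-variance weighted mean of the repeated measurements
of a source. The Python function name and its arguments are
hypothetical.

```python
def best_estimate(values, sigmas):
    """Combine repeated measurements into (weighted mean, uncertainty).

    Assumes independent measurements with known one-sigma errors.
    """
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, (1.0 / total) ** 0.5
```

The same routine would apply to either position coordinate or
to brightness, given an error estimate for each observation.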

5.4 Extended Sources and Survey Calibration 

As discussed in Section 4.0, the techniques needed for extended 
source processing are significantly different from point- 
source procedures, a fact determined largely by the difference 
in final products. To produce large-scale maps of these regions 
the raw data from many scans must be combined and transformed to 
an array of D.C.-like brightness values. This array is then 
transformed into a graphic image or contour map of appropriate 
scale. Routines which remove point sources from the data may be 
desired, and other routines which put them back on the maps may 
also be needed. For smaller extended sources, integration of the 
total brightness might be performed and the source included in 
the master catalog with an indication of the size of the region. 
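
The combination of many scans into a brightness array can be
sketched as a simple binning step. The Python example below
assumes that each sample has already been converted to a
position and a D.C.-like brightness value; that conversion is
the harder part and is not shown. The function and parameter
names are hypothetical.

```python
import math


def grid_scans(samples, x_min, y_min, cell, nx, ny):
    """Average (x, y, brightness) samples onto an ny-by-nx grid.

    Cells that receive no samples are returned as None.
    """
    sums = [[0.0] * nx for _ in range(ny)]
    counts = [[0] * nx for _ in range(ny)]
    for x, y, brightness in samples:
        ix = math.floor((x - x_min) / cell)
        iy = math.floor((y - y_min) / cell)
        if 0 <= ix < nx and 0 <= iy < ny:  # ignore samples off the map
            sums[iy][ix] += brightness
            counts[iy][ix] += 1
    return [
        [sums[j][i] / counts[j][i] if counts[j][i] else None
         for i in range(nx)]
        for j in range(ny)
    ]
```

The resulting array is the input to the contour or image
routines; summing the cells of a small region would give the
integrated brightness of a compact extended source.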

The software routines for producing contour maps are readily
available. Similarly, routines for producing photo images can
be modified to suit the resolution capability of the survey
instrument.

Calibration of the star survey is another major problem which
falls outside of the main data processing flow. Only in
a carefully designed observing program can the calibration be
done completely separately from the survey itself. For example,
the IRAS mission is planned to observe a small set of
standard stars once or twice per orbit; all measurements
during the following orbit would then be calibrated against
these observations. In other survey programs, however, the
standard stars and the survey itself were mixed together on
every scan, with some of the observations of known stars being
called "standards" and the others "unknowns." The voltage
measurements of these standard stars are fit to their defined
brightness in a least-squares sense to produce calibration
factors for the detectors.
Monitoring of these standards must be done at all levels of 
the processing scheme so that any long-term variations in the 
calibration can be discovered and so that the final catalog 
values are truly best estimates of the actual source brightness. 
Difficulties arise with this technique when the system 
responsivity varies during the survey since no single calibration 
star is normally observed often enough to monitor the variations. 
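
The least-squares fit described above can be illustrated for
the simplest case. The Python sketch below assumes a linear
detector response with no zero offset, so that brightness is a
single factor times voltage; this model, and the function name,
are assumptions made for the example only.

```python
def calibration_factor(voltages, brightnesses):
    """Least-squares factor k minimizing sum((b - k * v) ** 2)."""
    numerator = sum(v * b for v, b in zip(voltages, brightnesses))
    denominator = sum(v * v for v in voltages)
    if denominator == 0:
        raise ValueError("need at least one nonzero voltage")
    return numerator / denominator
```

Recomputing the factor over successive groups of standard-star
observations is one way the long-term variations mentioned
above could be monitored.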